Internet Search. We collected plant-level data from the Internet in three steps. First, we searched accessible public records. Second, we searched for relevant articles in Polish newspapers and for information on the given company’s official website. Last, we ran Google searches with relevant keywords to find any additional information.
- At the beginning of our search, we found the appropriate identification numbers for a given firm, namely the REGON, NIP, and KRS. Using any one of these three IDs, we could look up basic information about the firm (e.g. whether it is still active and, if not, when it stopped functioning) by entering the ID into the search engine on the Central Statistical Office’s website (Wyszukiwarka GUS) and, additionally, into third-party sites offering digital recordkeeping services (KRS-Online.com.pl and IMSiG). Information provided by these websites included the date of formation, the date of liquidation, and the date of removal from the registry. Typically, a state-owned enterprise (SOE) retained its ID when it was restructured and privatized. However, in some instances, when the newly formed private company was not a direct continuation of the former SOE, it was registered as a completely new entity. This would happen, for example, when the new investor changed the business and production profile of the company. At the same time, a privatized SOE could be recognized as a completely new firm if other parts of its capital were sold off or liquidated. Because our data are at the plant level, we considered a newly established firm to be a continuation of an SOE when the plant in our database kept operating after the restructuring.
- If the desired data could not all be obtained through the identification numbers above, we searched for websites linked with the specific plant, firm or industry. Most existing firms with a website feature some sort of “about us” section with relevant historical information. The level of detail with which a company website explained the history of an existing plant varied considerably, so such websites could not be taken as the sole means of identification for surviving plants. Further sources, mostly national and local newspapers, had to be used to gain deeper insight into the current status of some of the plants. Newspaper and media sources also contributed information on liquidation and changes in ownership. For plants that were liquidated, restructured or privatized in the 1990s and early 2000s, we searched the digital archive of the newspaper Rzeczpospolita (archiwum.rp.pl). Additionally, we searched for articles on gazeta.pl, wp.pl and naszemiasto.pl as well as in appropriate local newspapers (whenever their digital archive was available). The local newspapers searched included Dziennik Bałtycki, Dziennik Wschodni, Gazeta Lubuska and Głos Wielkopolski, as well as many regional weeklies (in Polish: “Tygodnik”).
- When the two previously mentioned methods failed to yield the information we desired, we did a basic Google search on the given plant/firm. The specific search terms were:
	- For matters of privatization: plant name* + “prywatyzacja” OR “podział” OR “inwestor” OR “FDI” OR “zagraniczny inwestor” OR “spółka akcyjna” OR “spółka skarbu państwa” OR “akcje” OR “sprzedaż” (translated search terms: “privatization”, “division”, “investor”, “foreign investor”, “joint-stock company”, “treasury-owned enterprise”, “stock” and “sale”).
	- For matters of bankruptcy: plant name* + “upadłość” OR “likwidacja” OR “podział” OR “nierentowność” OR “sprzedaż mienia” OR “sprzedaż gruntu” OR “zamknięcie” (translated search terms: “bankruptcy”, “liquidation”, “split”, “unprofitable”, “asset sale”, “property sale” and “closing”).
 
- To ensure the reliability of the information found in the search results, we sought multiple sources that verified given facts about a plant. A date or event was included in our dataset only when it was reported by two or more sources. These sources had to be independent of each other: two websites that cited the same source did not count as two independent sources. Thus, only primary sources were considered factual.
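
The keyword queries and the two-source rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the dataset: the function names, the record fields (`plant`, `event`, `date`, `origin`) and the idea of tagging each search hit with the primary source it relies on are all hypothetical.

```python
from collections import defaultdict

# Keyword lists copied from the search terms listed above.
PRIVATIZATION_TERMS = [
    "prywatyzacja", "podział", "inwestor", "FDI", "zagraniczny inwestor",
    "spółka akcyjna", "spółka skarbu państwa", "akcje", "sprzedaż",
]
BANKRUPTCY_TERMS = [
    "upadłość", "likwidacja", "podział", "nierentowność",
    "sprzedaż mienia", "sprzedaż gruntu", "zamknięcie",
]


def build_query(plant_name, terms):
    """Return the plant name followed by the quoted keywords joined with OR.

    Assumption: the query is a plain string typed into a search box;
    no search API is called here.
    """
    quoted = " OR ".join(f'"{term}"' for term in terms)
    return f"{plant_name} {quoted}"


def confirmed_facts(reports):
    """Keep facts that are backed by at least two independent sources.

    Each report is a dict with the hypothetical keys:
      plant, event, date -- the fact being reported
      origin             -- the primary source the website relies on
                            (the website itself if it reports first-hand)
    Websites citing the same origin are counted once, mirroring the rule
    that two sites quoting one source are not two independent sources.
    """
    origins_by_fact = defaultdict(set)
    for report in reports:
        fact = (report["plant"], report["event"], report["date"])
        origins_by_fact[fact].add(report["origin"])
    return {fact for fact, origins in origins_by_fact.items() if len(origins) >= 2}
```

Calling `build_query(name, BANKRUPTCY_TERMS)` yields one query string per plant, and `confirmed_facts` would then be applied to the hand-coded search hits. Deciding which primary source a website relies on remains a manual judgment; the code only does the counting.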
Court Archives. We collected data from the Regional Commercial Court in Warsaw (in Polish: “Sąd Okręgowy w Warszawie – Sąd Gospodarczy”). The registry of state enterprises held by the court (RPP, in Polish: Rejestr Przedsiębiorstw Państwowych) was used for information about the potential privatization or liquidation of state-owned enterprises. We queried the records available in several regional courts, but did not repeat the procedure for all regional courts because it proved inefficient, for three reasons. First, most of the information from the RPP is also available online in alternative sources (the RPP holds the source documents, but the status has been recorded in a number of digital registries). Second, a large number of privatizations involved, at an initial stage, a transformation from a state enterprise into a treasury-owned company (in Polish: jednoosobowa spółka skarbu państwa). Given that our interest encompasses the complete privatization process, RPP records end too early for many plants. Third, the RPP records information at the firm level, whereas our data are at the plant level.
Business Intelligence. After completing all of the above steps, we tried to fill the remaining gaps in our dataset by purchasing information from two business intelligence firms, one domestic and one international. We asked whether they could find any information on the status, transformation, fate, employment or sales of some of the plants for which we could not find information. Both firms informed us that they were unable to find any relevant information.
 
        



