Sample of Dutch FADN 2006 : design principles and quality of the sample of agricultural and horticultural holdings

The mission of Wageningen UR (University & Research centre) is 'To explore the potential of nature to improve the quality of life'. Within Wageningen UR, nine specialised research institutes of the DLO Foundation have joined forces with Wageningen University to help answer the most important questions in the domain of healthy food and living environment. With approximately 30 locations, 6,000 members of staff and 9,000 students, Wageningen UR is one of the leading organisations in its domain worldwide. The integral approach to problems and the cooperation between the various disciplines are at the heart of the unique Wageningen Approach.

R.W. van der Meer, H.B. van der Veen and H.C.J. Vrolijk
Sample of Dutch FADN 2011
LEI Wageningen UR, P.O. Box 29703, 2502 LS Den Haag, The Netherlands
E publicatie.lei@wur.nl
www.wageningenUR.nl/lei

The EU Farm Accountancy Data Network (FADN) requires the Netherlands to send bookkeeping data of 1,500 farms to Brussels every year. This task is carried out by LEI and CEI. The data sent to Brussels mainly involves technical and financial-economic information. For national policy purposes additional data is collected, such as pesticide use, manure production, nature management, non-farm income and rural development. This report explains the background of the farm sample for the year 2006. The report mainly focuses on the Dutch contribution to the European Farm Accountancy Data Network. All phases, from the determination of the selection plan and the recruitment of farms to the quality control of the final sample, are described in this report.

Introduction
The EU Farm Accountancy Data Network (FADN) requires the Netherlands to send bookkeeping data for 1,500 farms to Brussels every year. This task is carried out by the Agricultural Economics Research Institute (LEI) and the Center for Economic Information (CEI). The legislation of the FADN demands that the member states prepare a selection plan and a report on the results of the selection. This report fulfils this obligation. Furthermore, the report gives an analysis of the quality of the sample.

Population and Selection plan 2006
The population (field of survey) of the FADN is defined as all farms above the threshold of 16 European Size Units (ESU). In the Netherlands farms between 16 and 1,200 ESU are included in the population (table 3.1). A stratified random sample is drawn, in which economic farm size and type of farming are used as stratification variables. The scheme for the types of farming is based on a Dutch version of the Common Agricultural Typology that is also used by EUROSTAT. The total agricultural population contains 79,435 farms according to the agricultural census. The field of survey contains 60,353 farms. These farms cover an important part (87%) of the production capacity (table 3.1). In the selection plan, LEI planned to select 1,500 farms for the 2006 accounting year. In the last few years fewer farms were submitted to Brussels due to capacity problems, but in 2006 the requirement of at least 1,500 farms was met.

Result of recruitment and quality of the 2006 sample
For 2006, 1,506 farms were included in the sample and were delivered to Brussels (table 5.8). Chapter 6 gives a quantitative evaluation of the resulting sample. A comparison of the field of survey with the total agricultural population shows that 23% of the farms are below the lower threshold. These farms are responsible for a small percentage of production only. The sample results in a coverage of 90% of the production for most of the agricultural activities. In horticulture, part of the production is not covered because it takes place on farms above the upper threshold. Therefore the upper threshold has been increased to 2,000 ESU. This increase has been introduced as a trial in 2006 and has been integrated in the selection plan starting from the year 2007. There are 140 firms larger than 2,000 ESU. Table 6.2 gives a description of the coverage of a large number of activities. Table 6.3 shows the relationship between types of farming and agricultural activities. The numbers show that only a limited percentage of pigs is produced on specialised pig farms, while at the other extreme almost all mushrooms are produced on specialised mushroom farms. Two important aspects of a sample, the representativeness of the sample and the reliability of estimates, are evaluated in sections 6.3.3 and 6.3.4. Table 6.4 evaluates for many variables whether there is a difference between the agricultural census and the estimate based on the FADN sample. These tables provide useful information for specific research projects, enabling the researcher to determine whether the sample is representative for his or her topic.

Objective of the report
In 1965 the European Commission adopted a regulation (nr. 79/65/EEG) in which member states were obliged to set up a network for the collection of accountancy data on the incomes and business operation of agricultural holdings in the European Economic Community. The purpose of the data network is defined as the annual determination of incomes on agricultural holdings, and a business analysis of agricultural holdings. The Netherlands were required to provide financial economic information on 1,500 farms to Brussels. For the management of the system, the EU requires information on the selection of farms that are included in the national FADN systems. In particular the regulation prescribes the provision of data on the establishment of a selection plan and the recruitment of farms.
With respect to the selection plan the regulation EEG 1859/82 prescribes (article 6): 'Each Member State shall appoint a liaison agency whose duties shall be: …to draw up and submit to the National Committee for its approval, and thereafter to forward to the Commission:
- the plan for the selection of returning holdings, which plan shall be drawn up on the basis of the most recent statistical data, presented in accordance with the Community typology of agricultural holdings,
- the report on the implementation of the plan for the selection of returning holdings.'
This report provides all the relevant background information on the population, the selection plan, implementation of the selection plan and quality of the sample of data that is to be provided to Brussels and which forms the basis for a wide range of national research projects.

Structure of the report
Chapter 2 gives a description of the background of the Dutch FADN system. Chapter 3 describes the agricultural population in the year 2006. This chapter will also consider the demarcation of the population as used in the Dutch FADN. Also the design of the sample of the Dutch FADN system is described. Chapter 4 reports on the selection plan 2006. Chapter 5 provides information on the implementation of the selection plan and the recruitment of new farms. Chapter 6 provides a qualitative and quantitative evaluation of the sample 2006.
2 Statistical background of the Dutch FADN sample

Introduction
In the Dutch FADN detailed records on 1,500 agricultural and horticultural farms are kept. Besides financial-economic information, a broad set of technical-economic, socio-economic and environmental-economic data is collected. One of the reasons for the Dutch FADN system is the legal obligation to provide information on the financial economic situation of farms to Brussels. However, an even more important use of the data can be found at the national level. Data from the FADN system is used for many national policy evaluations and research projects.
Based on a sample of farms, estimations are made for the whole population. This might raise the question of how conclusions can be drawn for the whole population if only a limited number of farms are observed. The answer to this question can be found in the selection of farms that are included in the sample. A cook does not eat all the soup to judge its quality either. It is important to stir well before tasting; the spoon of soup should reflect all flavours in the pan of soup. The spoon of soup should be representative of the whole pan of soup. The same is true for the FADN sample. The farms that are included in the FADN should be representative of the whole population. In this way a sample can provide better information than a census (in which all units are observed). With a fixed budget it is much easier to collect good data on a limited number of farms than to collect information on all farms. With a limited number of farms and thus a limited number of data collectors, it is easier to ensure good procedures and good training to collect reliable data.
An important issue is how to ensure that the farms that are included in the FADN sample are representative of the whole population. Use is made of a disproportional stratified random sample. A stratified sample implies that the population is divided into a number of groups. Subsequently farms are selected from each of the groups. The variables on which the groups are defined should be relevant variables to make sure that the farms that are included in one group are similar (at least in the important aspects). Using this stratification, and selecting farms from each group, ensures that farms from all groups and consequently with different characteristics are included in the sample.
Disproportional means that not all farms have the same chance of being included in the sample. Groups which are relatively homogeneous, i.e. farms which show large similarities, have a lower chance of being included in the sample. After all, if all the farms are very similar, a limited number of observations are enough to draw reliable conclusions (in the extreme case that all farms are exactly identical, it would be enough to have only one observation). In case of less homogeneous groups it is important to have a larger number of observations to make reliable estimates.
The choice of the stratification variables has therefore an important impact on the representativeness of the sample.
This way of selecting farms makes it possible to make unbiased estimates for the whole population of farms. Based on the sample farms in a certain group, estimations can be made for all the farms in that group. Stratification assures that farms are selected from all groups, thereby allowing estimations for all groups. All groups together make up the whole population. In the Dutch FADN this is achieved by assigning a weight to each sample farm. The weight is calculated by dividing the number of population farms in a group by the number of sample farms in this same group.
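The weight calculation described above can be illustrated with a minimal sketch. The stratum labels and farm counts below are hypothetical and are not taken from the report's tables; the function name `stratum_weights` is likewise an assumption made for illustration.

```python
from collections import Counter


def stratum_weights(population_strata, sample_strata):
    """Return weight per stratum: population farms / sample farms in that stratum."""
    pop_counts = Counter(population_strata)
    sample_counts = Counter(sample_strata)
    return {
        stratum: pop_counts[stratum] / n_sample
        for stratum, n_sample in sample_counts.items()
    }


# Hypothetical strata, labelled as (type of farming, size class).
population = [("dairy", 1)] * 300 + [("dairy", 2)] * 120 + [("arable", 1)] * 80
sample = [("dairy", 1)] * 10 + [("dairy", 2)] * 8 + [("arable", 1)] * 4

weights = stratum_weights(population, sample)

# Summing the weights of all sample farms recovers the population size,
# provided every population stratum contains at least one sample farm.
total_weight = sum(weights[stratum] for stratum in sample)
print(weights)
print(total_weight)
```

Each sample farm in a stratum thus stands for a number of population farms equal to its weight, which is why strata with a low sampling fraction carry high weights.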
Stratification also improves the representativeness in case of non-response. If a farm which is asked to join the FADN system refuses, another farm in the same size class and of the same type of farming can be selected. If there is a difference between the selection plan and the actual implementation, stratification helps to improve the representativeness by taking into account the real sampling fraction.
Finally, stratification makes the maintenance of the sample easier. Due to attrition and changes in the population it is sometimes necessary to supplement certain groups. Stratification makes a more focused replacement possible.
The relationship between the agricultural population and the FADN sample is presented in figure 2.1. The agricultural census provides an almost complete description of the agricultural population. Part of this census or part of this population is defined as the field of observation in the FADN. In the definition of the field of observation a lower threshold and an upper threshold are applied. Furthermore, an additional criterion on the share of agricultural income in total income is used. These criteria will be further discussed.

Lower threshold
The lower threshold of 16 ESU has been used for a long period of time. It is specified in the legislation underlying the FADN. The historical background was to distinguish small farms which were only held as a hobby or as side activity from real commercial farms producing for the market. Although the number of farms excluded from the field of survey is quite substantial the percentage of production value which is not covered due to this threshold is very limited.

Upper threshold
The upper threshold was introduced to exclude some non-agricultural organisations from the field of observation. The agricultural census contains some organisations with a lot of land but which are not considered as agricultural holdings (examples are airports, nature organisations and in earlier days organisations which managed the reclamation of land from water bodies). In order not to judge each individual holding, an upper threshold was introduced to exclude these from the field of survey. Due to the growth in farm size, especially in horticulture, it was decided to increase the upper threshold in order to fulfil the requirement to cover at least 90% of the agricultural productivity. At the current moment a project is being undertaken to assess whether farms above the threshold can be included in the sample in the future. Issues to be addressed will be: are large farms willing to cooperate, how can they be motivated, is the farm comparison report useful for them, how many resources will it take to administer these farms, etc. Based on the results of this project a decision will be made whether the upper limit will be maintained in the future.

Other income sources
For practical and methodological reasons a limitation on other income of the holding is used. In earlier times the rules were not clearly specified. Firms with a high share of other income sources were excluded from the sample for practical reasons, such as the impossibility of allocating costs and revenues to different activities, or the expectation that such firms would refuse to participate anyway because they cannot be motivated to participate. Recently clear rules have been specified on whether a firm belongs to the field of observation or not. A firm should have at least 16 ESU from primary agricultural activities, at least 25% of the turnover should come from primary agricultural activities, and agricultural activities (in the broadest sense, so as to include other gainful activities) should be the largest share of turnover of the holding.
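The three rules above can be written as a simple check. This is a sketch under stated assumptions: the `Holding` fields and the function name are hypothetical, all non-agricultural turnover is treated as one single category when deciding which share is largest, and the upper size threshold is deliberately left out because it is a separate criterion.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    esu_primary: float             # size from primary agricultural activities (ESU)
    turnover_primary: float        # turnover from primary agricultural activities
    turnover_other_gainful: float  # turnover from other gainful (farm-related) activities
    turnover_non_agri: float       # turnover from non-agricultural sources


def in_field_of_observation(h: Holding, lower_esu: float = 16.0) -> bool:
    """Apply the three 'other income' rules; the upper threshold is not checked here."""
    total = h.turnover_primary + h.turnover_other_gainful + h.turnover_non_agri
    if total <= 0:
        return False
    agri_broad = h.turnover_primary + h.turnover_other_gainful
    return (
        h.esu_primary >= lower_esu                # rule 1: at least 16 ESU
        and h.turnover_primary / total >= 0.25    # rule 2: at least 25% primary turnover
        and agri_broad > h.turnover_non_agri      # rule 3: agriculture is the largest share
    )


commercial_farm = Holding(40, 60, 10, 30)
hobby_farm = Holding(10, 60, 10, 30)
mostly_non_agri = Holding(40, 30, 10, 60)
print(in_field_of_observation(commercial_farm))
print(in_field_of_observation(hobby_farm))
print(in_field_of_observation(mostly_non_agri))
```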

Stratification criteria
Given these three criteria the field of observation of the FADN system is defined. Within this field of observation a stratification scheme is used. The stratification of the Dutch FADN is based on size of farming and type of farming. Although these criteria are similar to those used by the Commission, a more detailed look reveals substantial differences with the EU stratification. Differences are for example the use of separate strata for organic farming, and in several types of farming more detailed subtypes of farming are specified which are relevant for Dutch agriculture (for example starch potato farms, flower bulb farms, horticultural farms by type of production).
The Dutch situation is somewhat more complicated due to the fact that the size classes are different within different types of farming. The size distribution of, for example, horticultural farms is completely different from the size distribution of arable farms. To take these differences into account the borders of the size classes have been established for each type of farming separately. Despite this complication the strata are still a cross section between types of farming and size classes. In total 87 strata have been defined.

Figure 2.2 presents an overview of the sampling and recruitment processes. The agricultural census from Statistics Netherlands (CBS) is the starting point for the random sampling of farms. The random sampling takes place based on the selection plan as submitted to the European Commission. The selection plan will be further described in chapter 4. Based on the selection plan farms from the agricultural census are randomly drawn. This census (as available to researchers) does not contain addresses but only farm identifiers. These farm identifiers are sent to the ministry and the ministry returns the addresses. These addresses are forwarded to the regional offices, which are responsible for contacting farmers to request their participation. The farmers either refuse or accept the request to participate; this recruitment process and the non-response will be described in chapter 5. The regional offices collect the authorisations and forward them to the central office in The Hague. These authorisations are used to receive electronically available information from banks, suppliers, government and others. The information on the acceptance and refusal of farmers is also used to verify the quality of the sample (see chapter 6).

Introduction
This chapter will describe the population or, more precisely, the field of observation as covered by the FADN sample. A lower threshold is used to define the field of observation. This threshold and the consequences of this threshold will be described in section 3.2. Section 3.3 describes the strata which are used to subdivide the population. Section 3.4 reports the number of farms in each of the strata.

Defining the field of observation
Collecting detailed information at farm level requires considerable time and money. To assure an efficient and effective allocation of the available budget, the sample design focuses on certain groups in the population (demarcation of the population). Given limited capacity it is important to apply a sampling procedure that optimises the reliability of the sample estimates (through stratification).
Regulation 1859/82 of the EU Commission (adapted by regulation EEG no. 3548/85) defines the population (field of observation) for the Dutch FADN as those farms with a size of more than 16 European size units (ESU). Until 2001 this threshold was translated into 16 Dutch size units (DSU), which is roughly similar to 18.7 ESU. For the statistical use of the data and the comparability of results it was considered advisable to apply the ESU threshold. Therefore the lower limit of the Dutch FADN system has been 16 ESU since the year 2001.
In addition to a lower threshold there is also an upper threshold. This upper threshold has been adjusted every few years to take into account the growth of the average size of farms. Until 2001 the upper threshold was 800 DSU. In 2001 the upper threshold was raised to 1,200 ESU. The percentage of farms and the agricultural output excluded due to this upper threshold has been growing since 2001. For this reason the upper threshold has been increased again to 2,000 ESU. This increase has been introduced on a trial basis in 2006 and has been integrated in the sample and weighting scheme starting from the year 2007. In this report most of the analyses presented still focus on the upper threshold of 1,200 ESU. In 2006, 449 farms were excluded from the field of observation because of the upper threshold of 1,200 ESU (140 farms above 2,000 ESU). These farms were responsible for 12.8% of the total production (5.26% for farms larger than 2,000 ESU). Due to the lower threshold 18,663 farms were not covered by the FADN sample. Although this is a large number of farms, they are only responsible for 1.93% of the total production capacity. The number of farms and the share of economic production of these farms have slightly decreased compared to 2005. The population (field of observation) of the Dutch contribution to the EU FADN system is displayed in table 3.1.

Design of the stratification scheme
Farms are allocated to strata according to the following stratification variables: type of farming and size class. In the past a more detailed stratification scheme was used, but this resulted in numerous practical problems due to empty or nearly empty cells. Combining cells can easily lead to a distortion in the calculated results (a bias). Farms of a certain type of farming are divided into 3 size classes. In the past 4 size classes were used. The reduction of size classes can be explained by the problem of empty or nearly empty cells and the conclusion that a fourth size class only provided a very limited value in increasing the efficiency of the estimators.
In total 29 types of farming are distinguished (see table 3.2). For a number of types of farming a distinction is made between organic farming and non-organic farming. A compromise was found to fulfil the increasing demand for research on organic farms. Random selection of organic farms from the total population would result in a very low number of observations because of the low proportion of organic farms. The definition of separate strata for every type of farming would result in many practical problems. The number of strata would double. The problem of empty or nearly empty strata would increase seriously. In line with the existing stratification, a number of types of farming were selected where organic farming is especially relevant. The types that were originally selected were: field crop farms, dairy farms, field vegetables and combined crop farms. The growth in the organic sector was however lower than expected and aimed for by policy makers. This resulted in practical problems in the recruitment of organic farms, for example due to the fact that the number of farms according to the selection plan was close to or even higher than the actual number of farms in the population. To deal with this problem a number of organic strata have been combined. Organic field crop farms, field vegetables and combined crop farms have been integrated into one stratum, organic crop farms (Vrolijk, 2006).
The breakdown into subtypes is as follows: field crop farms have been itemised into starch potato farms, organic crops and all other field crop farms. The vegetables under glass farms have been broken down into paprika, cucumber, tomato and other. Cut flowers under glass are divided into roses, chrysanthemums and other cut flowers. The dairy farms are split into organic and non-organic dairy farms. Within field vegetables and the combined crop farms the organic farms have been separated. These are subsequently combined with the organic field crop farms.
The final stratification and the size thresholds for each of the strata are displayed in table 3.2. The thresholds were determined by optimal stratification in 2000 (see Vrolijk and Lodder, 2002) and have remained unchanged since then. The strata will be reconsidered in the shift to Standard Outputs.

This table shows that 60,353 farms fall within the field of observation. Dairy farms are clearly the largest group of farms. Almost one in every three farms is classified as a dairy farm.

Introduction
The allocation of the total capacity of sample farms is based on the relative importance and the heterogeneity of the different types of farming (see Dijk et al., 1995a and. Within each type of farming an optimal stratification (determination of thresholds of size classes) and optimal allocation (distribution of sample capacity over the different size classes) is applied.
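To make the idea of allocating sample capacity according to size and heterogeneity concrete, the sketch below uses the textbook Neyman allocation, in which the number of sample farms in a stratum is proportional to the product of its population size and its standard deviation. This is an assumption chosen for illustration; the exact procedure used for the Dutch FADN is described in the cited literature and may differ. The stratum names and figures are hypothetical.

```python
def neyman_allocation(strata, total_sample):
    """Allocate total_sample over strata in proportion to N_h * S_h.

    strata maps a stratum name to (population size N_h, standard deviation S_h).
    Rounding means the allocated numbers may not add up exactly to total_sample.
    """
    products = {name: n_pop * sd for name, (n_pop, sd) in strata.items()}
    denominator = sum(products.values())
    allocation = {}
    for name, (n_pop, _sd) in strata.items():
        share = total_sample * products[name] / denominator
        # At least one farm per stratum, and never more than the stratum contains.
        allocation[name] = min(n_pop, max(1, round(share)))
    return allocation


# Hypothetical size classes within one type of farming.
strata = {
    "small": (600, 10.0),
    "medium": (300, 20.0),
    "large": (100, 120.0),
}
allocation = neyman_allocation(strata, total_sample=60)
sampling_fractions = {name: allocation[name] / strata[name][0] for name in strata}
print(allocation)
print(sampling_fractions)
```

In this example the small but heterogeneous stratum of large farms receives half of the sample, which illustrates why the resulting sample is disproportional.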

2006 selection plan
The EU regulation prescribes the use of size class and type of farming as important variables in the stratification and the choice of farms. Due to differences in the exact stratification scheme it is necessary to take into consideration the different weights of farms in different strata (Dijk et al., 1995b).
The design principles of the sample of the FADN system facilitate an efficient alignment with the goals of the system (see chapter 2). A summary of the 2006 selection plan is provided in table 4.1. Given the goals of the FADN system the numbers provided in the table are the required number of observations per type of farming.
For the sample of 2006 a few changes have been made. The number of farms specialised in field vegetables has been reduced. This capacity was needed to fulfil the request to include large farms in the sample. The upper limit of the sample has been increased to 2,000 ESU to solve the problem of an increasing share of agricultural production not covered by the sample. Therefore the number of sample farms in a number of farm types, those with a substantial share above 1,200 ESU, has been increased. This concerns the glasshouses (vegetables as well as flowers), plant growers, mushroom growers and bulb growers. These are the types of farms where the share of production above the upper limit increased substantially in recent years.

Recruitment of farms
Based on the available number of farms in the FADN sample and the expected number of farms ending their participation before or during 2006 an estimate was made of the number of farms to be recruited. Furthermore, the variant of bookkeeping has been explicitly considered. An evaluation has been made of the policy and research relevance of sectors and based on this importance a decision has been made whether a type of farming is assigned to the EU variant, the corporate social performance (CSP) variant or a combination of both. This implied that some farms had to be switched to the other variant. In some cases this would result in the drop-out of the farm. This has been taken into consideration in the number of farms to be recruited.

Based on the number of farms to be recruited, as displayed in table 5.2, farms were randomly selected from the 2004 agricultural census. The random draw of farms took place per stratum. The number of drawn farms per stratum was 7 times higher than the required number of farms to ensure enough addresses, even with a high non-response rate in specific types of farming. The addresses were requested from an agency (Dienst Regelingen) of the Ministry of Agriculture. The farm identifiers of the randomly selected farms were sent to the Ministry, which sent back the addresses of these farms (under the strict condition that this information was only used for the recruitment of farms for the FADN). Using these addresses farms were contacted and asked to participate in the FADN.
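The random draw per stratum, with seven times as many farms drawn as are needed, could be sketched as follows. The function name, the stratum labels and the farm identifiers are hypothetical; capping the draw at the stratum size is an assumption for strata that contain fewer farms than seven times the required number.

```python
import random


def draw_candidates(census, required, oversample=7, seed=None):
    """Randomly draw oversample * required farm identifiers per stratum.

    census maps a stratum to the list of farm identifiers in that stratum;
    required maps a stratum to the number of farms to be recruited.
    """
    rng = random.Random(seed)
    drawn = {}
    for stratum, n_required in required.items():
        farms = census.get(stratum, [])
        # Assumption: a stratum cannot supply more farms than it contains.
        n_draw = min(len(farms), oversample * n_required)
        drawn[stratum] = rng.sample(farms, n_draw)
    return drawn


# Hypothetical census extract: farm identifiers only, no addresses.
census = {
    "dairy_small": list(range(1000, 1100)),
    "mushroom": list(range(2000, 2010)),
}
required = {"dairy_small": 5, "mushroom": 2}

candidates = draw_candidates(census, required, seed=2006)
for stratum, farm_ids in candidates.items():
    print(stratum, len(farm_ids))
```

Only the drawn identifiers would then need to be matched to addresses, which keeps the amount of address information requested to a minimum.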
Farms are asked to participate in the system in order to compensate for attrition and to take structural changes in agriculture into account. Some of the farms approached during the recruitment phase refused to participate. These refusals do not cause problems if these farms do not differ from farms that participate in their place. In the case where farms that refuse to participate systematically differ from the participating farms, this could result in a bias. If for example older farmers are less inclined to participate, this will result in a different age distribution in the sample compared to the population. The representativeness of the data with respect to age will be called into question; whether this is a problem or not depends on the research goals and the extent to which the important variables correlate with age. The representativeness is analysed in chapter 6.

Table 5.3 describes the response rate in the different types of farming. This table only includes those farms which were asked to participate in the CSP variant. This variant will be explained in more detail at the end of this section.

To develop a better understanding of the reasons for non-response a number of questions were asked to all farmers approached. Table 5.4 shows the results for the questions asked. In these questions the farmer had to indicate to which extent he or she agrees with a statement about his or her knowledge or attitude. The table shows a clear difference between those farmers who are willing to cooperate and those who are not. The ones who are willing to participate are more informed about the activities of LEI and the use of FADN data. Providing data and the FADN system are considered more useful by those who are willing to participate. The opinion about LEI with respect to objectivity and carefulness is better among the participants. The last question shows that non-participants have a significantly lower trust in the government.
Using these same variables discriminant analysis was applied to find the factors that are most discriminating between farmers who are willing to participate and farmers who refuse to participate. The analysis of the attitude of farmers shows that 'usefulness of FADN system' and 'usefulness of providing data' are the most important factors in predicting the participation of an individual farmer. This result is similar to that of the previous recruitment (Vrolijk et al., 2008).

Table 5.5 describes the number of farms where accounts were completed for the first time for the bookkeeping year 2006. Due to several factors this is not exactly the same as the number of farms recruited. First, farms can drop out during the first year of participation. Second, some farms were already recruited during a previous year, but due to capacity problems their bookkeeping was not completed for that year.

In table 5.6 a distinction is made between CSP observations (corporate social performance) and the total number of observations. Poppe (2004) describes that the introduction of a new bookkeeping system and budget cuts have resulted in a large pressure on available capacity. To deal with this pressure, a flexible data collection system has been introduced with two main variants in the data collection: the EU variant and the CSP variant. In the EU farm-income variant the most essential financial economic information is collected. This is the information that each member state is obliged to provide to Brussels. The information covered in this variant mainly focuses on family farm income, the balance sheet, a limited number of technical data (cropping pattern, livestock) and information on the EU subsidies. In the second variant, the CSP variant, a wide range of data is collected for EU and national purposes. It covers all the topics that are nowadays considered relevant in a report on the corporate social performance of a company or a farm. Therefore, besides the financial economic information as collected in the EU variant, a wide range of data is collected such as environmental data, other farm incomes, off-farm income, animal welfare, animal health and the level of innovation of firms.

Supply of 2006 farm results to the European Commission
The final delivery of 2006 data to the EU took place in December 2007. Data of 1,506 farms have been provided to Brussels (table 5.7). This is the highest number of farms in many years and it fulfils the obligation of 1,500 farms.
6 Evaluation of 2006 sample

Introduction
In this chapter the FADN sample for the year 2006 is evaluated in a qualitative and quantitative way. Section 6.2 provides an evaluation of the methodology of stratification and weighting. A crucial element is the calculation of weights. Section 6.3 provides the quantitative evaluation of the year 2006. This section focuses on the quality of the estimations that can be made based on the sample.

Introduction
This section deals with some practical problems related to the estimation process. Weights of individual farms are used to make estimations of frequencies, totals and averages of groups of farms (aggregated results) based on the data from the agricultural census and the FADN data. The method to calculate the weights of individual farms is crucial. The goal is to achieve unbiased estimates with a minimal variance. This enables the estimation of the confidence interval of the real population value and the minimisation of the total error. This is true for direct estimators. In the case of a ratio estimator this is not necessarily true, but ratio estimators are outside the scope of this publication (see Vrolijk et al., 2001, for a more extensive description of ratio estimators and other estimators).
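For a direct estimator, the link between unbiased estimation, variance and a confidence interval can be sketched with the standard textbook formulas for a stratified random sample. The formulas are an assumption for illustration and not necessarily the exact procedure applied in the Dutch FADN: the stratified mean is the population-share-weighted average of the stratum means, and its variance includes a finite population correction. The income figures and stratum sizes are hypothetical.

```python
import math
from statistics import mean, variance


def stratified_mean_ci(samples, pop_sizes, z=1.96):
    """Estimate a population mean and an approximate confidence interval.

    samples maps a stratum to the observed values (at least two per stratum);
    pop_sizes maps a stratum to the number of population farms in it.
    """
    n_population = sum(pop_sizes.values())
    estimate = 0.0
    var_estimate = 0.0
    for stratum, values in samples.items():
        n_pop = pop_sizes[stratum]
        n = len(values)
        share = n_pop / n_population   # W_h = N_h / N
        fpc = 1 - n / n_pop            # finite population correction
        estimate += share * mean(values)
        var_estimate += share ** 2 * fpc * variance(values) / n
    standard_error = math.sqrt(var_estimate)
    return estimate, (estimate - z * standard_error, estimate + z * standard_error)


# Hypothetical family farm incomes (in 1,000 euro) in two size classes.
samples = {"small": [20, 30, 25, 25], "large": [80, 120, 100, 100]}
pop_sizes = {"small": 300, "large": 100}

estimate, (lower, upper) = stratified_mean_ci(samples, pop_sizes)
print(estimate, lower, upper)
```

With so few observations per stratum a normal-based interval is only a rough indication; the example is meant to show the mechanics, not a recommended sample size.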
In the next section the method to calculate the weights of the farms is deĉ scribed in general terms. The method applied to calculate the weights is evaluĉ ated from a practical and theoretical perspective.

Method of calculation of weights
The objective of the Dutch FADN system is to give a representative view of the total population. The question is therefore how to draw conclusions on totals, averages and frequencies that are valid for the whole population based on individual farm data. For example, what is the average family farm income of all farms in agriculture and horticulture? The solution is found in weighting: the individual farm data are raised to the population level (for some variables the estimated values can be compared to the data that are available for the whole population, i.e. data which are included in the yearly agricultural census). A weight is assigned to every observed farm in the FADN system. The weight is defined as the ratio between the number of farms in a stratum according to the agricultural census and the number of farms in that stratum in the sample (in the FADN system). For the assignment of farms in the FADN system to strata the information from the year 2006 is used. These data can differ from the data at the time the farm was first selected for the system. This implies some kind of post-stratification. Weights can be calculated as soon as a substantial number of farms have been completed. During the year, when additional farms are completed, the weights are recalculated. The weights of the farms are recalculated until the accounts of all farms are completed and the final set of weights can be established. For preliminary estimations based on, for example, 50% of the farms, one should be aware that this 50% is not necessarily representative of the whole population.
The (post-)stratification of the farms is based on the 2006 agricultural census. The population in a specific stratum is continuously changing. Therefore the farms that belonged to a stratum in 2005 are not exactly the same as the farms that belong to that stratum in 2006. Due to these changes farms included in one stratum could have had different inclusion probabilities at the time of recruitment. In theory, to achieve unbiased estimators these differences in inclusion probabilities should be taken into account in the estimation process. However, the consequence of this would be a very complicated system with many different substrata with different inclusion probabilities. Therefore this complicated procedure is not applied. As a result, the theoretical assumption of a strictly random sample cannot be validated.
Although the calculation method applied in practice can lead to systematic distortions between estimated values and real values, the assumption of a random sample is made. This has several attractive consequences. The method to calculate weights is relatively easy, involves a limited set of homogeneous strata and results in a more effective use of data.
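The weighting rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the stratum names, farm counts and income values are hypothetical and are not taken from the report, and the function names are introduced here for the example.

```python
from collections import Counter


def stratum_weights(census_strata, sample_strata):
    """Weight per stratum: census farms in the stratum / sample farms in the stratum."""
    population = Counter(census_strata)
    sample = Counter(sample_strata)
    return {stratum: population[stratum] / n for stratum, n in sample.items()}


def weighted_mean(values, strata, weights):
    """Raise the sample values to population level and average them."""
    total = sum(weights[s] * v for v, s in zip(values, strata))
    represented_farms = sum(weights[s] for s in strata)
    return total / represented_farms


# Hypothetical data: 140 census farms in two strata, 8 sample farms.
census = ["dairy"] * 100 + ["arable"] * 40
sample_strata = ["dairy"] * 4 + ["arable"] * 4
incomes = [30.0, 34.0, 38.0, 42.0, 10.0, 20.0, 30.0, 40.0]

weights = stratum_weights(census, sample_strata)
estimated_mean = weighted_mean(incomes, sample_strata, weights)
```

In this hypothetical case each dairy farm in the sample represents 25 census farms and each arable farm represents 10, so the weights sum to the census total of 140 farms. Recalculating the weights when more accounts are completed amounts to calling `stratum_weights` again with the larger sample.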
Because of the applied sampling procedure (see section 2.1) the different strata have different sampling fractions. Strata with relatively homogeneous units have a lower sampling fraction than very heterogeneous strata. This also implies that the weights of farms diverge widely. Farms from a homogeneous cluster will have a larger weight (in principle the reciprocal of the sampling fraction) and therefore represent a larger number of farms. The differences in sampling fractions are shown in table 6.1. These percentages are calculated by dividing the required number of farms in the selection plan (table 5.1) by the number of population units (table 3.3).

Every year all horticultural and agricultural farms are registered in the agricultural census, but this registration only represents the situation at a certain moment during the year. Therefore it is possible that farms are missing from this registration. Furthermore, the number of farms shows a significant downward trend (this trend is stronger for certain types of farms and less strong for others). As a consequence estimations for the year 2006 might be overestimations of reality. Distortions in the number of farms in the census can therefore cause incorrect estimations of aggregates.
Furthermore, the typology of farms according to the agricultural census might differ from the typology according to the FADN data. The census reflects the situation at a certain point in time, while the FADN system describes the farm during a whole year. In order to take these differences into account two weighting methodologies are available in the Dutch FADN system.

Introduction
This section focuses on the quality of the estimations based on the 2006 FADN sample. Figure 6.1 shows the same structure as displayed in figure 2.1, but it adds the quality aspects. Section 6.3.2 provides information on the coverage of the sample; the coverage compares the total population as described by the census with the field of observation of the FADN sample. Section 6.3.3 analyses the extent to which distortions might occur between the sample and the population due to over- or under-representation of farms with specific characteristics; it compares the characteristics of the field of observation with those of the actual FADN sample. Section 6.3.4 provides information on the reliability of estimates based on the FADN sample. The last quality aspect listed in figure 6.1, the response rate and the non-response, has already been described in the previous chapter.

Coverage
It is desirable to have a sample that represents the population as well as possible. A clear distinction should be made between coverage and representativeness. This section describes the coverage; section 6.3.3 deals with the representativeness. To get an idea of the extent to which the total population is covered by the sample it is relevant to distinguish several aspects. Farms that are too small or are not registered in time are not part of the agricultural census (b). The sampling frame (c) is the basis for the choice of sample farms and consists of farms that are registered in the agricultural census and have a size of more than 16 ESU and less than 2,000 ESU. From this sampling frame the sample is drawn (d).
In policy analysis and research it is essential to distinguish between farming types (for example specialised pig fattening farms) and agricultural activities (pig fattening). In the report on the redesign of the FADN sample it was illustrated that types of farming should not be the only focus of research. Agricultural activities are important in many research projects.

Table 6.2 gives an indication of the extent to which the FADN sample covers the whole population. To that end a comparison is made between the farms in the sampling frame (all the farms that have a chance of being included in the FADN sample) (c) and the total population as described by the agricultural census (b). Direct comparison with all farms (a) would be better, but the unregistered farms are unknown and the practical difference is very limited. The sampling frame covers the population to a large extent. For example, with respect to production, almost 93% is covered by the sample. Small farms are excluded from the sampling frame; this means that a substantial number of the farms, and to a lesser extent also of labour, are outside the sampling frame. With respect to agricultural activities, the table shows that some activities are not well covered by the sample. This mainly concerns the activities that are commonly found on very small or on very large specialised farms.

To give a complete picture of a certain agricultural activity it is therefore important to look at the activities on all farm types. For example, not only pig fattening farms will create added value from pig fattening; other types of farms can also be involved in this activity (although it is not their main business). The next table describes the extent to which a certain activity can be found on certain types of farming. The figures in italics indicate that an activity belongs to that type of farming (based on the principal types of farming).
For example, 82.8% of the agricultural activity fattening pigs can be found on the intensive livestock farms. This means that 17.2% of this activity can be found on farms that belong to other types of farming, for example arable farms. Looking in more detail, the skewness is even larger. Type of farming 5011, the specialised pig fattening farms, is responsible for 56% of the pig fattening activity. This implies that 44% of this activity takes place within other types. Production of mushrooms is a highly specialised agricultural activity. More than 99% of this activity takes place on specialised mushroom farms.

Representativeness
Because of the stratification scheme the sample will provide a good representation of the population on the main characteristics (stratification variables) at the beginning of a year. During the year farms might drop out of the sample and changes might occur in the population. Despite these changes the representativeness is maintained by applying post-stratification on the resulting sample and the changed population. Representativeness with respect to the stratification variables does not necessarily imply that the sample is representative for all variables. Such full representativeness is impossible unless the sample size approximates the whole population. Table 6.4 shows to what extent the sample is representative for a number of variables in the agricultural census.

The following guideline can help in the interpretation of the table: a relative difference which is close to the relative standard error cannot be regarded as proof of systematic differences between the sample and the population. If the relative difference is more than two times the relative standard error then it is less likely that these differences can be explained by sampling errors. It is very unlikely that the difference is caused by coincidence if the relative difference is more than 3 times the relative standard error.
An example can illustrate how the table should be interpreted. The average number of DSU (Dutch size units) of pigs as measured in the 2006 agricultural census is 7.68 (i.e. the average of all farms within the field of observation). If the same variable is estimated based on the FADN sample an average of 8.18 is calculated. It might seem that the number of pigs is slightly overestimated in the sample. However, the relative standard error of the estimate is 3.3%. When this standard error is compared to the relative difference between both values (6%), the conclusion that there is a significant difference cannot be supported.

The information in table 6.4 gives an indication for which variables, and consequently for which research projects, it might be wise to perform post-stratification or use alternative estimation techniques to take into account the differences between the sample and the population. For example, in studies in which the age of the farmer plays an important role it might be useful to apply alternative estimation techniques.
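The guideline and the pig example can be checked with a short Python sketch. Two points are assumptions made here, not statements from the report: the relative difference is computed with the census value as the base (the report quotes about 6% without stating the base), and any ratio of at most two is treated as "no proof of a systematic difference".

```python
def relative_difference(sample_estimate, census_value):
    """Absolute difference as a fraction of the census value (assumed base)."""
    return abs(sample_estimate - census_value) / census_value


def interpret(rel_diff, rel_se):
    """Apply the rule of thumb: compare the relative difference with 2x and 3x the relative standard error."""
    ratio = rel_diff / rel_se
    if ratio > 3:
        return "very unlikely to be coincidence"
    if ratio > 2:
        return "less likely to be sampling error"
    return "no proof of a systematic difference"


# Figures from the text: census average 7.68 DSU, FADN estimate 8.18 DSU,
# relative standard error 3.3%.
rel_diff = relative_difference(8.18, 7.68)
verdict = interpret(rel_diff, 0.033)
```

With these inputs the relative difference stays just below two times the relative standard error, which matches the conclusion in the text that a significant difference cannot be supported.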
The last two columns of table 6.4 provide more detailed information on the difference between the population and the sample. These differences can be explained on one hand by differences in the number of farms on which a certain activity occurs (a value larger than zero) and on the other by the average of this activity on farms which are in this activity. For example: the number of DSU dairy cows in the FADN is higher than in the agricultural census. This difference is partly explained by a higher estimation of the number of farms with dairy cows and partly by a lower estimation of DSU of dairy cows on farms with dairy cows (95.2 = 96.3% * 98.9).
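The decomposition behind the last two columns can be written out as follows. The notation is introduced here for illustration and does not appear in the report: $\bar{y}$ is the average over all farms, $p$ the share of farms on which the activity occurs, and $\bar{y}_{+}$ the average on farms with the activity.

```latex
\bar{y} = p \cdot \bar{y}_{+},
\qquad
\frac{\bar{y}^{\mathrm{FADN}}}{\bar{y}^{\mathrm{census}}}
  = \frac{p^{\mathrm{FADN}}}{p^{\mathrm{census}}}
    \cdot
    \frac{\bar{y}_{+}^{\mathrm{FADN}}}{\bar{y}_{+}^{\mathrm{census}}}
```

The figures quoted for dairy cows follow this product form, since $0.963 \times 98.9 \approx 95.2$. How the individual figures are scaled is defined by table 6.4, which is not reproduced here.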
A comparison between the sample and the population as registered in the agricultural census does not fully answer the question whether estimations of financial, economic and technical characteristics are free of bias. It is, for example, possible that farms with relatively good or bad management skills, and therefore performance, are over-represented in the sample.

Reliability
The previous subsection provides some indicators of whether there are systematic differences between the sample and the population (representativeness of the sample). This section focuses on the reliability of the estimates.
The calculation of averages of groups based on sampling units implies that there can be differences between the estimated value and true population value. These differences can occur due to the random selection of units to be included in the sample. Table 6.5 provides an indication of the level of precision of the estimates for a set of important goal variables.
The precision of estimates can be measured by the standard error of the estimate of a variable. The standard error is used to calculate the confidence interval. This confidence interval describes the range in which the true population value will lie, given a certain level of certainty. The confidence interval ranges from the calculated average minus two times the standard error to the calculated average plus two times the standard error. The calculated averages of two groups are significantly different (with 95% certainty) if the difference is larger than two times the square root of the sum of squares of the standard errors of the two group averages.
This section provides the reliability of estimates for a number of important goal variables for different types of farming. This calculation is based on the available CSP observations (see section 5.3).

There are clear differences in the reliability of estimates between different types of farming. The estimates for the dairy sector are the most reliable because of the large number of farms included in the sample, which reflects the importance of the dairy sector in Dutch agriculture. The decision on the number of farms is described in Vrolijk and Lodder (2002).
Tables 6.7 and 6.8 describe the relative standard error (coefficient of variation). This is the standard error divided by the group average. A higher relative standard error implies less reliable estimates, but the value is strongly affected by the absolute value of the average. If the average value approaches zero, the relative standard error can become very large. A meaningful evaluation of the standard error requires a simultaneous use of tables 6.5 and 6.6 on one hand and tables 6.7 and 6.8 on the other.
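The two rules in this paragraph translate directly into code. The sketch below is a minimal illustration; the group averages and standard errors are hypothetical values, not figures from table 6.5, and the function names are introduced here for the example.

```python
import math


def confidence_interval(mean, standard_error, factor=2.0):
    """Interval from mean - 2*SE to mean + 2*SE, as described in the text."""
    return (mean - factor * standard_error, mean + factor * standard_error)


def significantly_different(mean_a, se_a, mean_b, se_b, factor=2.0):
    """True if the difference exceeds 2 * sqrt(se_a^2 + se_b^2)."""
    threshold = factor * math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > threshold


# Hypothetical group averages with their standard errors.
low, high = confidence_interval(50.0, 4.0)
differ = significantly_different(50.0, 4.0, 62.0, 3.0)
not_shown_to_differ = significantly_different(50.0, 4.0, 58.0, 3.0)
```

With standard errors of 4 and 3 the threshold is 2 × 5 = 10, so a difference of 12 counts as significant while a difference of 8 does not.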
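The sensitivity of the relative standard error to the size of the average can be shown with a small Python sketch. The numbers are hypothetical, and dividing by the absolute value of the average (so that negative averages, such as a negative income, give a positive ratio) is an assumption made here rather than something the report states.

```python
def relative_standard_error(standard_error, mean):
    """Standard error divided by the (absolute) group average."""
    if mean == 0:
        raise ValueError("relative standard error is undefined for a zero average")
    return standard_error / abs(mean)


# Same absolute standard error, very different group averages (hypothetical).
rse_large_average = relative_standard_error(5.0, 100.0)
rse_small_average = relative_standard_error(5.0, 2.0)
```

The same standard error of 5 gives a relative standard error of 5% for an average of 100 but 250% for an average of 2, which is why the relative and absolute tables need to be read together.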