Staff Publications

  • About

    'Staff publications' is the digital repository of Wageningen University & Research

    'Staff publications' contains references to publications authored by Wageningen University staff from 1976 onward.

    Publications authored by the staff of the Research Institutes are available from 1995 onwards.

    Full-text documents are added when available. The database is updated daily and currently holds about 240,000 items, of which 72,000 are open access.

    We have a manual that explains all the features 

Current refinement(s): keywords==databases

Records 1 - 20 / 253

D3.5 Formalized stepwise approach for implementing logistical concepts using BeWhere and LocaGIStics : S2Biom Project Grant Agreement No. 608622
Annevelink, E. ; Elbersen, B. ; Leduc, S. ; Staritsky, I.G. - \ 2016
S2Biom - 41 p.
biomassa - modellen - databanken - hulpbronnengebruik - biobased economy - duurzaamheid (sustainability) - europa - logistiek - biomass - models - databases - resource utilization - sustainability - europe - logistics
This deliverable describes a formalized stepwise approach for implementing optimal logistical concepts in the practical design of national and regional […] chains and for assessing their economic and GHG performance […] BeWhere and LocaGIStics. It describes the functionality of and the relation between these two logistical assessment tools. BeWhere and LocaGIStics are closely interlinked so that LocaGIStics can further refine and detail the outcomes of the BeWhere model and that the BeWhere model can use the outcome of the LocaGIStics model to modify their calculations if needed.
The BeWhere model supports the development of EU-wide and national strategies to develop an optimal network of biomass delivery chains. The basis of this tool is a techno-economic spatial model that enables the optimal design and allocation of biomass delivery chains (at national level) based on the minimization […] emissions of the full supply chain taking account economies of scale, in order to meet certain demand.
LocaGIStics is a regional assessment tool for biomass delivery chains. This tool can support the user to design optimal biomass delivery chains and networks at regional level and analyze in a comparative way (for different biomass delivery chains) the spatial implications and the environmental and economic performance. It will take account of the biomass cost-supply, the conversion and pre-treatment technology options and novel logistical concepts.
D3.4 + D3.6: Cover report Results logistical case studies : S2Biom Project Grant Agreement n°608622
Annevelink, E. ; Gabrielle, B. ; Carozzi, M. ; Garcia Galindo, D. ; Espatolero, S. ; Izquierdo, M. ; Väätäinen, K. ; Anttila, P. ; Staritsky, I.G. ; Vanmeulebrouk, B. ; Elbersen, B. ; Leduc, S. - \ 2016
S2Biom - 23 p.
biobased economy - europa - databanken - modellen - duurzaamheid (sustainability) - hulpbronnengebruik - biomassa - logistiek - europe - databases - models - sustainability - resource utilization - biomass - logistics
The S2Biom project - Delivery of sustainable supply of non-food biomass to support a “resource-efficient” Bioeconomy in Europe - supports the sustainable delivery of nonfood biomass feedstock at local, regional and pan European level through developing strategies, and roadmaps that will be informed by a “computerized and easy to use” toolset (and respective databases) with updated harmonized datasets at local, regional, national and pan European level for EU28, Western Balkans, Moldova, Turkey and Ukraine. A case based approach was followed, where optimal logistical concepts (conceptual designs) were matched with the specific regional situation. This was done in three logistical case studies that were performed: 1. Small-scale power production with straw and Miscanthus in the Burgundy region (France); 2. Large-scale power production with straw and with residual woody biomass in the Aragon region (Spain); 3. Advanced wood logistics in the Province of Central Finland.
D3.4 + D3.6: Annex 1 Results logistical case study Burgundy : S2Biom Project Grant Agreement n°608622
Annevelink, E. ; Gabrielle, B. ; Carozzi, M. ; Staritsky, I.G. ; Vanmeulebrouk, B. ; Elbersen, B. ; Leduc, S. - \ 2016
S2Biom - 48 p.
biobased economy - biomass - sustainability - resource utilization - europe - databases - models - logistics - biomassa - duurzaamheid (sustainability) - hulpbronnengebruik - europa - databanken - modellen - logistiek
The S2Biom project - Delivery of sustainable supply of non-food biomass to support a “resource-efficient” Bioeconomy in Europe - supports the sustainable delivery of nonfood biomass feedstock at local, regional and pan European level through developing strategies, and roadmaps that will be informed by a “computerized and easy to use” toolset (and respective databases) with updated harmonized datasets at local, regional, national and pan European level for EU28, Western Balkans, Moldova, Turkey and Ukraine. In the S2Biom project the logistical case study in Burgundy was the first that was performed. In this report the assessment methods for the logistical case study are described in Chapter 2. This is followed by the set-up of the Burgundy case study in Chapter 3. In Chapter 4 the type of data needed and in Chapter 5 the actual data used are described. Then the results are presented that were obtained by the BeWhere (Chapter 6) and by the LocaGIStics model (Chapter 7). Conclusions and recommendations are given in Chapter 8.
Making eco logic and models work : an integrative approach to lake ecosystem modelling
Kuiper, Jan Jurjen - \ 2016
Wageningen University. Promotor(en): W.M. Mooij, co-promotor(en): J.H. Janse; Jeroen de Klein. - Wageningen : Wageningen University - ISBN 9789462579446 - 192
ecology - models - ecosystems - modeling - aquatic ecology - water management - water quality - databases - ecologie - modellen - ecosystemen - modelleren - aquatische ecologie - waterbeheer - waterkwaliteit - databanken

Dynamical ecosystem models are important tools that can help ecologists understand complex systems, and turn understanding into predictions of how these systems respond to external changes. This thesis revolves around PCLake, an integrated ecosystem model of shallow lakes that is used by both scientists and water quality managers to understand and predict eutrophication effects in shallow lake ecosystems. Shallow lakes provide some of the clearest examples of alternative stable states in natural systems. PCLake can be used to calculate the critical nutrient loading, that is, the nutrient loading where an abrupt regime shift occurs from a clear aquatic plant dominated state to a turbid phytoplankton dominated state, or vice versa. Four different aspects of modelling with PCLake are addressed in this thesis: (1) making the model better accessible for the modelling community, (2) improving the model, (3) developing scientific theory, and (4) exploring new applications for water quality management.

Following a general introduction to the thesis in chapter 1, the Database Approach To Modelling (DATM) is introduced in chapter 2. DATM was devised to make dynamic models more accessible. The idea of DATM is that the mathematical equations of a model are stored in a database, independently of programming language and software-specific information. From the database, the information can be automatically translated, augmented and compiled into working model code for various modelling frameworks (software programs).
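
The database idea described above can be illustrated with a minimal sketch. It is not the DATM implementation: the table layout, the two toy equations and the per-framework templates are all assumptions made up for illustration. It only shows how equations held in a language-neutral form can be rendered as code for more than one target.

```python
import sqlite3

# Hypothetical schema (an assumption, not the DATM schema): one row per
# equation, stored without any framework-specific syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE equations (name TEXT, kind TEXT, expression TEXT)")
conn.executemany(
    "INSERT INTO equations VALUES (?, ?, ?)",
    [
        ("growth", "auxiliary", "r * algae * (1 - algae / K)"),
        ("algae", "derivative", "growth - m * algae"),
    ],
)

# Hypothetical templates: how each target framework writes an assignment.
TEMPLATES = {
    "python": {"auxiliary": "{name} = {expr}", "derivative": "d{name}_dt = {expr}"},
    "r": {"auxiliary": "{name} <- {expr}", "derivative": "d{name} <- {expr}"},
}


def translate(connection, framework):
    """Render every stored equation as one line of code for the given framework."""
    rows = connection.execute(
        "SELECT name, kind, expression FROM equations ORDER BY rowid"
    )
    templates = TEMPLATES[framework]
    return [templates[kind].format(name=name, expr=expr) for name, kind, expr in rows]


if __name__ == "__main__":
    for target in ("python", "r"):
        print(target, translate(conn, target))
```

Keeping the equations in one table means a change to a model equation is made once and then regenerated for every target, which is the accessibility argument the abstract makes.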

In chapter 3 the weak link between ecosystem models and real ecosystems is discussed in relation to model calibration and improvement. In a previous stage PCLake has been calibrated using data of more than 40 lakes to obtain a best overall fit, which has greatly increased the scope of the model by making it suitable for more generalized studies on temperate shallow lakes. However, because of this calibration, adding missing functional components to the model at a later stage does not automatically increase the validity of the model, as it may bring the model ‘out of balance’. This is exemplified by adding filter feeding zoobenthos to PCLake, which were previously ignored.

In chapter 4, the relation between food-web theory and alternative stable states theory is scrutinized. Both theoretical paradigms are highly influential in modern ecology as they help scientists understand how stability emerges in complex natural ecosystems. Unfortunately, they developed independently and it is largely unclear how the resilience of a food web relates to the stability of the complete ecosystem. For this study PCLake was used as a virtual reality from which ‘empirical’ information is sampled to parameterize a food web model, following traditional food web methods. This allowed calculating the stability of the food web along a gradient of environmental change, knowing that the complete ecosystem shows a regime shift once the critical nutrient loading is exceeded.

In chapter 5 the question is asked to what extent models of a different form can be used to describe the same natural phenomenon, and hence, how these models can be used for a better understanding of such natural phenomena. Using three classical extensions of the famous Lotka-Volterra equations, which unlike PCLake can be fully mathematically understood, we analyze the consequence of changing a system with a sophisticated functional response term (e.g. Holling type II or III) into a system with a simpler functional response term while maintaining equilibrium densities and material fluxes. These results give new insight into when empirical data can be linked to mathematical models to estimate the stability properties of real ecosystems.

Although PCLake is predominantly applied in the context of ecosystem restoration of turbid phytoplankton dominated lakes, chapter 6 focusses on the clear water state after aquatic plant dominance has been re-established. Dense stands of aquatic plants easily cause nuisance, and hence the removal of aquatic plants is an emerging management issue. Yet, because aquatic plants play an important role in stabilizing the clear water state, the removal of plant biomass can potentially trigger a critical transition back to the turbid water state. Currently there is only limited empirical and theoretical understanding of how harvesting of aquatic plants affects ecosystem functioning, which frustrates effective and efficient ecosystem management. With PCLake the impact of harvesting is evaluated, in terms of nuisance reduction and ecosystem stability, for a wide range of external nutrient loadings, mowing intensities and timings. Additionally, the model is used to estimate how much phosphorus is removed from the system during harvesting.

In chapter 7 I discuss the added value of taking an integrative approach to modelling, and discuss the integrated nature of the studies presented in this thesis. It’s also important to note that these studies were part of a larger research project with the overall aim of increasing the usefulness and the validity of PCLake and its twin model PCDitch, and to enhance the confidence in the models among water quality managers. A synopsis of the overarching collaborative research project on PCLake and PCDitch is presented in chapter 8.

Landelijke Vegetatie Databank : technische documentatie, Status A
Hennekens, S.M. ; Boss, M. ; Schmidt, A.M. - \ 2016
Wageningen : Statutory Research Tasks Unit for Nature & the Environment (WOt-technical report 74) - 59
vegetatie - databanken - planten - nederland - vegetation - databases - plants - netherlands
This document describes the technical environment, tools and models that are relevant to the management of the national vegetation database (Landelijke Vegetatie Databank). It is intended to record the processes and procedures. Obtaining quality status A is not a goal in itself here, but it is the long-term target to which this document contributes. The purpose of the national vegetation database is to record, in a structured manner, data on the occurrence of vegetation types and thereby also of plant species in the Netherlands. The procedures for collecting and managing these data are described in this document.
Surface WAter Scenario Help (SWASH) version 5.3 : technical description
Roller, J.A. te; Berg, F. van den; Adriaanse, P.I. ; Jong, A. de; Beltman, W.H.J. - \ 2015
Wageningen : Statutory Research Tasks Unit for Nature & the Environment (WOT Natuur & Milieu) (WOt-technical report 27) - 77
sloten - waterkwaliteit - pesticiden - emissiereductie - modellen - databanken - ditches - water quality - pesticides - emission reduction - models - databases
The user-friendly shell SWASH, acronym for Surface WAter Scenarios Help, assists the user in calculating pesticide exposure concentrations in the EU FOCUS surface water scenarios. SWASH encompasses five separate tools and models: (i) FOCUS Drift Calculator, calculating pesticide entries through spray drift deposition, (ii) PRZM-3, calculating pesticide entries through run-off, (iii) MACRO, calculating pesticide entries through drainage, (iv) TOXSWA, calculating the behaviour of pesticides in small surface waters, and (v) SPIN, a central database for storage and editing of pesticide properties. The SWASH database contains information on projects and runs created by the user. This report gives a detailed description of the necessary flow of data between the various models, to make them communicate smoothly with each other. It also specifies the installation requirements for the MACRO, PRZM and TOXSWA models. The MACRO model uses an MS-Access database to store its substance and run information, while PRZM makes use of separate data files. TOXSWA uses the central SWASH database. After completing a SWASH session the user should manually perform simulations with the three individual models.
KringloopWijzer verder verbeterd
Have, H. ten; Haan, M.H.A. de - \ 2015
V-focus 12 (2015)1. - ISSN 1574-1575 - p. 36 - 37.
melkveehouderij - kringlopen - mineralen - bemesting - fosfaat - mestoverschotten - databanken - governance - dairy farming - cycling - minerals - fertilizer application - phosphate - manure surpluses - databases
Since 1 January 2015, dairy farmers with a phosphate surplus have been required to complete the KringloopWijzer. The KringloopWijzer has again been adapted and improved for 2015; a new version can be downloaded from the website. On 6 January the Centrale Database Kringloopwijzer was put into use. Despite the improvements, agricultural advisers can still expect all sorts of questions from 'their' livestock farmers.
Registreren: meten is weten!
Maurice - Van Eijndhoven, M.H.T. ; Oldenbroek, J.K. - \ 2015
Zeldzaam huisdier 40 (2015)2. - ISSN 0929-905X - p. 10 - 11.
rassen (dieren) - dierveredeling - selectie - registratie - selectief fokken - fenotypen - fokdoelen - stamboeken - databanken - breeds - animal breeding - selection - registration - selective breeding - phenotypes - breeding aims - herdbooks - databases
To be able to select for particular traits, these must be 'measurable' and must be recorded. Only then does it become visible whether real improvements towards the breeding goal are achieved in subsequent generations. In this second article we explain why it is important to record properly and what a sound recording system looks like.
Landelijke Vegetatie Databank : technische documentatie
Hennekens, S.M. ; Boss, M. ; Schmidt, A.M. - \ 2014
Wageningen : Wettelijke Onderzoekstaken Natuur & Milieu (WOt-technical report 30) - 46
vegetatietypen - databanken - gezamenlijke gegevens - vegetatiemonitoring - nederland - vegetation types - databases - aggregate data - vegetation monitoring - netherlands
This document describes the technical environment, tools and models that are relevant to the management of the national vegetation database (Landelijke Vegetatie Databank). It is intended to record the processes and procedures. Obtaining quality status A is not a goal in itself here, but it is the long-term target to which this document contributes. The national vegetation database serves to record, in a structured manner, data on the occurrence of vegetation types and thereby also of plant species in the Netherlands. The procedures for collecting and managing these data are described in this document.
Endnote X7
Brouwer, J.H.D. ; Renkema, J.M.S. ; Kersten, A.M.P. - \ 2014
[Wageningen] : Wageningen UR Library - 42
computer software - bibliografieën - databanken - indexeren - literatuur - publicaties - documentatie - bibliographies - databases - indexing - literature - publications - documentation
Habitattypen in Natura 2000-gebieden : beoordeling van oppervlakte, representativiteit en behoudsstatus in de Standard Data Forms (SDFs)
Janssen, J.A.M. ; Weeda, E.J. ; Schippers, P. ; Bijlsma, R.J. ; Schaminee, J.H.J. ; Arts, G.H.P. ; Deerenberg, C.M. ; Bos, O.G. ; Jak, R.G. - \ 2014
Wageningen : Wettelijke Onderzoekstaken Natuur & Milieu (WOt-technical report 8)
natura 2000 - habitats - habitatrichtlijn - oppervlakte (areaal) - natuurgebieden - databanken - nederland - inventarisaties - gegevensbeheer - habitats directive - acreage - natural areas - databases - netherlands - inventories - data management
This report substantiates the so-called Standard Data Forms (SDFs) for habitat types of Annex I of the Habitats Directive. These SDFs form part of the database on the Natura 2000 sites in the Netherlands. The database plays a role in any legal proceedings and is used by the European Commission to assess EU member states' proposals for Natura 2000 sites. This report indicates the information on which the ecological data for the habitat types in the Natura 2000 sites were assessed. The data concern the area of a habitat, its representativity, its relative area compared with the area in the Netherlands as a whole, the conservation status of the habitat type, and a general evaluation. These data were scored using detailed assessment scales for all Habitats Directive sites in which a habitat type occurs. In total this involves more than 1130 combinations of habitat types and sites.
Habitatrichtlijnsoorten in Natura 2000-gebieden : beoordeling van populatie, leefgebied en isolatie in de Standard Data Forms
Ottburg, F.G.W.A. ; Janssen, J.A.M. - \ 2014
Wageningen : Wettelijke Onderzoekstaken Natuur & Milieu (WOt-technical report 9) - 105
natura 2000 - habitatrichtlijn - soortendiversiteit - natuurgebieden - databanken - nederland - inventarisaties - gegevensbeheer - habitats directive - species diversity - natural areas - databases - netherlands - inventories - data management
This report explains the so-called Standard Data Forms (SDFs) for species of Annex II of the Habitats Directive. These SDFs form part of the database on the Natura 2000 sites in the Netherlands. The database plays a role in any legal proceedings and is used by the European Commission when assessing EU member states' proposals for Natura 2000 sites. This report indicates the information on which the ecological data in the SDF are based. The data concern the numbers of a species, the type of population, the relative population, the quality of the habitat, the position of the population within the European range, and a general evaluation. These data were scored using simple assessment scales for all Habitats Directive sites in which a species occurs. In total, 373 combinations of species and sites are explained.
Status en trend van structuur- en functiekenmerken van Natura 2000-habitattypen op basis van het Landelijk Meetnet Flora (LMF) en de Landelijke Vegetatie Databank (LVD) : achtergronddocument voor de Artikel 17-rapportage
Knegt, B. de; Meij, T. van der; Hennekens, S.M. ; Janssen, J.A.M. ; Wamelink, G.W.W. - \ 2014
Wageningen : Wettelijke Onderzoekstaken Natuur & Milieu (WOt-technical report 7)
natura 2000 - habitats - vegetatietypen - flora - monitoring - databanken - gegevensbeheer - nederland - vegetation types - databases - data management - netherlands
This report gives, for each habitat type, an overview of the status and trend of the structure and function characteristics for the Article 17 reporting under the Habitats Directive. To obtain these data in a reproducible way, use was made of the field measurements carried out for the Landelijk Meetnet Flora (LMF) and the Landelijke Vegetatie Databank (LVD). The results show that for most habitat types sufficient measurement data from the LMF and the LVD are available to make statistically reliable statements, although not for all requested field measurements of structure and function characteristics. For each habitat type, the results show the extent to which the standards set for the abiotic and biotic structure and function characteristics are met. In addition, some recommendations are made for improving the analyses in future.
The MAGNET Model: Module description
Woltjer, G.B. ; Kuiper, M. ; Kavallari, A. ; Meijl, H. van; Powell, J.P. ; Rutten, M.M. ; Shutes, L.J. ; Tabeau, A.A. - \ 2014
The Hague : LEI Wageningen UR (Manual / LEI 14-57) - 146
modellen - modelleren - databanken - simulatiemodellen - evenwicht - models - modeling - databases - simulation models - equilibrium
Text mining for metabolic reaction extraction from scientific literature
Risse, J.E. - \ 2014
Wageningen University. Promotor(en): Ton Bisseling; Jack Leunissen, co-promotor(en): P.E. van der Vet. - Wageningen : Wageningen University - ISBN 9789461739001 - 138
metabolomica - gegevensanalyse - databanken - text mining - publicaties - wetenschappelijk onderzoek - moleculaire biologie - thesauri - enzymen - metabolieten - metabolomics - data analysis - databases - publications - scientific research - molecular biology - enzymes - metabolites

Science relies on data in all its different forms. In molecular biology and bioinformatics in particular, large-scale data generation has taken centre stage in the form of high-throughput experiments. In line with this exponential increase of experimental data has been the near exponential growth of scientific publications. Yet where classical data mining techniques are still capable of coping with this deluge in structured data (Chapter 2), access to information found in scientific literature is still limited to search engines allowing searches at the level of keywords, titles and abstracts. However, large amounts of knowledge about biological entities and their relations are held within the body of articles. When extracted, this data can be used as evidence for existing knowledge or for hypothesis generation, making scientific literature a valuable scientific resource. Unlocking the information inside the articles requires a dedicated set of techniques and approaches tailored to the unstructured nature of free text. Analogous to the field of data mining for the analysis of structured data, the field of text mining has emerged for unstructured text, and a number of applications have been developed in that field.

This thesis is about text mining in the field of metabolomics. The work focusses on strategies for accessing large collections of scientific text and on the text mining steps required to extract metabolic reactions and their constituents, enzymes and metabolites, from scientific text. Metabolic reactions are important for our understanding of metabolic processes within cells, and that information provides an important link between genotype and phenotype. Furthermore, information about metabolic reactions stored in databases is far from complete, making it an excellent target for our text mining application.

In order to access the scientific publications for further analysis they can be used as flat text or loaded into database systems. In Chapter 2 we assessed and discussed the capabilities and performance of XML-type database systems to store and access very large collections of XML-type documents in the form of the Medline corpus, a collection of more than 20 million scientific abstracts. XML data formats are common in the field of bioinformatics and are also at the core of most web services. With the increasing amount of data stored in XML comes the need for storing and accessing the data. The database systems were evaluated on a number of aspects broadly ranging from technical requirements to ease-of-use and performance. The performance of the different XML-type database systems was measured on Medline abstract collections of increasing size and with a number of different queries. One of the queries assessed the capabilities of each database system to search the full text of each abstract, which would allow access to the information within the text without further text analysis. The results show that all database systems cope well with the small and medium dataset, but that the full dataset remains a challenge. Also the query possibilities varied greatly across all studied databases. This led us to conclude that the performances and possibilities of the different database types vary greatly, also depending on the type of research question. There is no single system that outperforms the others; instead different circumstances can lead to a different optimal solution. Some of these scenarios are presented in the chapter.

Among the conclusions of Chapter 2 is that conventional data mining techniques do not work for the natural language part of a publication beyond simple retrieval queries based on pattern matching. The natural language used in written text is too unstructured for that purpose and requires dedicated text mining approaches, the main research topic of this thesis. Two major tasks of text mining are named entity recognition, the identification of relevant entities in the text, and relation extraction, the identification of relations between those named entities. For both text mining tasks many different techniques and approaches have been developed. For the named entity recognition of enzymes and metabolites we used a dictionary-based approach (Chapter 3) and for metabolic reaction extraction a full grammar approach (Chapter 4).

In Chapter 3 we describe the creation of two thesauri, one for enzymes and one for metabolites, with the specific goal of allowing named entity identification, the mapping of identified synonyms to a common identifier, for metabolic reaction extraction. In the case of the enzyme thesaurus these identifiers are Enzyme Nomenclature numbers (EC numbers), in the case of the metabolite thesaurus KEGG metabolite identifiers. These thesauri are applied to the identification of enzymes and metabolites in the text mining approach of Chapter 4. Both were created from existing data sources by a series of automated steps followed by manual curation. Compared to a previously published chemical thesaurus, created entirely with automated steps, our much smaller metabolite thesaurus performed on the same level for F-measure with a slightly higher precision. The enzyme thesaurus produced results equal to our metabolite thesaurus. The compactness of our thesauri permits the manual curation step important in guaranteeing accuracy of the thesaurus contents, whereas creation from existing resources by automated means limits the effort required for creation. We concluded that our thesauri are compact and of high quality, and that this compactness does not greatly impact recall.
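
As a rough illustration of the dictionary-based approach described above (not the thesis pipeline), the sketch below maps synonyms found in a sentence to a common identifier and scores the result with precision, recall and F-measure. The four thesaurus entries are toy examples chosen for illustration, and the function names `find_entities` and `f_measure` are hypothetical.

```python
import re

# Toy thesaurus (illustrative entries only): synonym -> common identifier.
# Enzymes map to EC numbers, metabolites to KEGG compound identifiers.
THESAURUS = {
    "alcohol dehydrogenase": "EC 1.1.1.1",
    "adh": "EC 1.1.1.1",
    "glucose": "C00031",
    "d-glucose": "C00031",
}


def find_entities(text):
    """Return the identifiers of all thesaurus synonyms that occur in the text."""
    lowered = text.lower()
    found = set()
    for synonym, identifier in THESAURUS.items():
        # Match whole terms only: no word character or hyphen on either side.
        pattern = r"(?<![\w-])" + re.escape(synonym) + r"(?![\w-])"
        if re.search(pattern, lowered):
            found.add(identifier)
    return found


def f_measure(predicted, gold):
    """Return (precision, recall, F-measure) for two sets of identifiers."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "ADH oxidises ethanol; D-glucose is not a substrate."
    print(find_entities(sentence))
```

Because several synonyms resolve to one identifier, a compact, curated dictionary can still recognise varied spellings, which is the trade-off between compactness and recall that the abstract discusses.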

In Chapter 4 we studied the applicability and performance of a full parsing approach using the two thesauri described in Chapter 3 for the extraction of metabolic reactions from scientific full-text articles. For this we developed a text mining pipeline built around a modified dependency parser from the AGFL grammar lab using a pattern-based approach to extract metabolic reactions from the parsing output. Results of a comparison to a modified rule-based approach by Czarnecki et al. using three previously described metabolic pathways from the EcoCyc database show a slightly lower recall compared to the rule-based approach, but higher precision. We concluded that despite its current recall our full parsing approach to metabolic reaction extraction has high precision and potential to be used to (re-)construct metabolic pathways in an automated setting. Future improvements to the grammar and relation extraction rules should allow reactions to be extracted with even higher specificity.

To identify potential improvements to the recall, the effect of a number of text pre-processing steps on the performance was tested in a number of experiments. The experiment that had the most effect on performance was the conversion of schematic chemical formulas to syntactically complete sentences, allowing them to be analysed by the parser. In addition to the improvements to the text mining approach described in Chapter 4, I make suggestions in Chapter 5 for potential improvements and extensions to our full parsing approach for metabolic reaction extraction. The core focus here is the increase of recall by optimising each of the steps required for the final goal of extracting metabolic reactions from the text. Some of the discussed improvements are to increase the coverage of the used thesauri, possibly with specialist thesauri depending on the analysed literature. Another potential target is the grammar, where there is still room to increase parsing success by taking into account the characteristics of biomedical language. On a different level are suggestions to include some form of anaphora resolution and across sentence boundary search to increase the amount of information extracted from literature.

In the second part of Chapter 5 I make suggestions as to how to maximise the information gained from the text mining results. One of the first steps should be integration with other biomedical databases to allow integration with existing knowledge about metabolic reactions and other biological entities. Another aspect is some form of ranking or weighting of the results to be able to distinguish between high quality results useful for automated analyses and lower quality results still useful for manual approaches. Furthermore I provide a perspective on the necessity of computational literature analysis in the form of text mining. The main reasoning here is that human annotators cannot keep up with the amount of publications so that some form of automated analysis is unavoidable. Lastly I discuss the role of text mining in bioinformatics and with that also the accessibility of both text mining results and the literature resources necessary to create them. An important requirement for the future of text mining is that the barriers around high-throughput access to literature for text mining applications have to be removed. With regards to accessing text mining results, there is a long way to go for many applications, including ours, before they can be used directly by biologists. A major factor is that these applications rarely feature a suitable user interface and easy to use setup.

To conclude, I see the main role of a text mining system like ours in gathering evidence for existing knowledge and giving insights into the nuances of the research landscape of a given topic. When using the results of our reaction extraction system for the identification of ‘new’ reactions it is important to go back to the actual evidence presented for extra validations and to cross-validate the predictions with other resources or experiments. Ideally text mining will be used for generation of hypotheses, in which the researcher uses text mining findings to get ideas on, in our case, new connections between metabolites and enzymes; subsequently the researcher needs to go back to the original texts for further study. In this role text mining is an essential tool on the workbench of the molecular biologist.

The International Lactuca database
Treuren, R. van; Menting, F.B.J. - \ 2014
plant genetic resources, gene banks
The International Lactuca Database includes accessions of species belonging to the genus Lactuca, but also a few accessions belonging to related genera. Passport data can be searched online or downloaded. Characterization and evaluation data can be accessed via the downloading section. Requests for seed material from accessions included in the database should be directed to the institute that maintains the accession. An overview of the holding institutes is provided in the section "contributors". The database concentrates on passport data of all Lactuca species in germplasm collections worldwide.
Database in de maak
Niekerk, T.G.C.M. ; Reuvekamp, B.F.J. ; Bestman, M.W.P. - \ 2013
De Pluimveehouderij 43 (2013)2. - ISSN 0166-8250 - p. 27 - 27.
pluimveehouderij - databanken - uitloop - hennen - dierenwelzijn - huisvesting van kippen - verenpikken - poultry farming - databases - outdoor run - hens - animal welfare - chicken housing - feather pecking
The Low Input Breeds project is searching for the ideal free-range hen. After an initial inventory, a large number of flocks are now being followed in detail. Here are some first results.
ICW-nota's digitaal beschikbaar
Massop, H.T.L. ; Bolt, F.J.E. van der - \ 2013
Stromingen : vakblad voor hydrologen 19 (2013)1. - ISSN 1382-6069 - p. 53 - 54.
hydrologie - bodemwater - grondwater - oppervlaktewater - wetenschappelijk onderzoek - publicaties - databanken - hydrology - soil water - groundwater - surface water - scientific research - publications - databases
The ICW was founded in response to the flood disaster of 1953. This went hand in hand with making the land inundated by salt water suitable again for agricultural production. The research from that period was recorded in, among other places, the so-called ICW notes (ICW-nota's). As of 2013, many of these reports have been scanned and thereby made more accessible to the field of hydrology.
Basiskaart Natuur 2004 : van versie 1.0 naar 3.1
Kramer, H. ; Clement, J. ; Knegt, B. de - \ 2013
Wageningen : Wettelijke Onderzoekstaken Natuur & Milieu (WOt-werkdocument 313) - 48
landgebruik - bossen - oppervlakte (areaal) - databanken - monitoring - inventarisaties - natuurgebieden - land use - forests - acreage - databases - inventories - natural areas
The main objective of this report is to obtain Quality Status A (KwaliteitsStatus A) for the production of BN2004_v3.1, which was produced from the Geodatabase Natuur (see section 2.9). An important aspect here is the comparability of the new version (3.1) with the original version BN2004_V2.0, because the technique used to produce the two files differs. To establish whether the difference in technique also leads to a difference in results, the file BN2004_v2.1 was created. BN2004_v2.1 was produced from the Geodatabase Natuur using exactly the same source files and the same combination rules as for BN2004_v2.0. This was necessary because several changes were made in BN2004_v3.1 relative to 2.0, so that the two were not directly comparable.
Report of ad hoc meeting of the Chairs, Vice-Chairs and Central Crop Database Managers of the ECPGR Solanaceae and Cucurbits Working Groups
Díez Niclós, M.J. ; Valcárcel, J.V. ; Íñigo, A.G. ; Dooijeweert, W. van; Menting, F. ; Weerden, G. van der; Daunay, M. - \ 2013
Valencia, Spain : ECPGR - 5
genenbanken - databanken - solanaceae - cucurbitaceae - werkgroepen - gene banks - databases - working groups
The scope of this two-day ad hoc meeting was to explore the possibility of abandoning Central Crop Databases (CCDBs) and focusing on EURISCO, in order to avoid duplication of data and working efforts. The main facilities of both systems were discussed. Current progress concerning the selection of Most Appropriate Accessions (MAAs) was reviewed for Solanaceae and Cucurbits, and possible plans for the new phase of ECPGR (Phase IX, 2014-2018) were discussed.