Fangorn and comma delimited files

Fangorn as a conversion program for ASCII input material was explained in paragraph 5.1. The Fangorn version from 1992 can also convert files in comma delimited format, which is a standard output format from dBase or dBase-like files (see paragraph 2.1.3.).

Although there are a few good programs to convert dBase databases, you may encounter problems with databases which were created by more recent versions of dBase than version III or III Plus. In these cases Fangorn may solve the problem.

To make a comma delimited output of a dBase database, use this dBase command:

  copy all to books.txt delimited
(the name "books" can of course be replaced by any kind of name).

You can limit the number of records that are written with other dBase commands, e.g.

  go top
  copy next 100 to books.txt delimited
or
  copy all for language="French" to books.txt delimited
and so on. In fact, with large databases it is wise to make a copy of the whole database and a copy of a small part of it in order to test the conversion specifications.

As it is very difficult to recognize the structure of the input file in the comma delimited file, you should always try to get a listing of the original structure and if possible a print out of at least one typical record in its original format. This will enable you to analyse the input material.
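As an aid for this analysis, the fields of a comma delimited line can be numbered and compared with the listing of the original structure. The following Python lines are only a minimal sketch: the sample line is hypothetical, and it is an assumption that dBase writes character fields between double quotes and numeric fields without them.

```python
import csv
import io

def number_fields(line):
    """Split one comma delimited line and number its fields from 1."""
    # csv.reader keeps a comma inside a quoted field as part of that field.
    fields = next(csv.reader(io.StringIO(line)))
    return list(enumerate(fields, start=1))

# Hypothetical line for a record with three fields (title, author, year).
sample = '"Windows for Workgroups made easy","Sheldon, Tom",1993'

for position, value in number_fields(sample):
    print(position, repr(value))
```

The positions printed here correspond to the order of the fields in the dBase structure, which is also the order in which the fields have to be described in the specification file.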

The actual procedure to convert comma delimited files with Fangorn is similar to conversion of ASCII material. There are only two differences:

  1. In the specification file you must fill in the line
      Text indicating start/end of record   :
    with "@@delim@@" to tell Fangorn that a comma delimited file is to be converted. The other lines that indicate the start or end of a record stay empty.
  2. In the description of each field you must leave the first line empty:
      Tag in incoming file                  :
    A comma delimited file does not have any tags. Fangorn will just take one field after another.
Of course you can still use all the facilities Fangorn offers to change strings, to create occurrences of repeated fields and to distinguish subfields.

This could be an example of a dBase record:

                                                                                
TITLE       Windows for Workgroups made easy 
AUTOR1      Sheldon, Tom              
AUTOR2      Sheldon, Jim              
AUTOR3                               
CITY        Osborne                 
PUBLISHER   McGraw-Hill             
YEAR        1993                   
PAGES              550              
KEYWORDS    computers; operating systems; Windows             
To convert it, this specification file could be used:
                        SPECIFICATION FILE
Which conversion is specified         :
Number of fields                      :9
Text indicating start/end of record   :@@delim@@
Tag indicating start/end of record    :
Line indicating start/end of record   :
Two texts indicating start/end of rec.:

Tag in incoming file                  :
Tag in ISO-2709 file                  :1
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :2
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :,
Subfield delimiter ISO-2709 file      :ab
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :3
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :,
Subfield delimiter ISO-2709 file      :ab
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :4
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :,
Subfield delimiter ISO-2709 file      :ab
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :5
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :6
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :7
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :8
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :
Texts to replace                      :

Tag in incoming file                  :
Tag in ISO-2709 file                  :9
(The following is not mandatory)
Spaces in continuation lines          :
Subfield delimiter in incoming file   :
Subfield delimiter ISO-2709 file      :
Delimiter occurrences incoming file   :;
Texts to replace                      :
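
What this specification asks for can be mimicked in a few lines of Python. This is only a sketch of the idea, not Fangorn's actual code: the names SPEC, SAMPLE and convert are hypothetical, the sample line is an assumed rendering of the record above, and it is an assumption that empty fields are skipped and that surrounding spaces are trimmed.

```python
import csv
import io

# One entry per field, in the order of the file:
# (ISO tag, subfield delimiter, subfield codes, occurrence delimiter)
SPEC = [
    (1, "", "", ""),
    (2, ",", "ab", ""),
    (3, ",", "ab", ""),
    (4, ",", "ab", ""),
    (5, "", "", ""),
    (6, "", "", ""),
    (7, "", "", ""),
    (8, "", "", ""),
    (9, "", "", ";"),
]

# Assumed comma delimited form of the example record (hypothetical).
SAMPLE = ('"Windows for Workgroups made easy","Sheldon, Tom","Sheldon, Jim",'
          '"","Osborne","McGraw-Hill","1993",550,'
          '"computers; operating systems; Windows"')

def convert(line, spec):
    """Pair the fields with the specification by position only."""
    fields = next(csv.reader(io.StringIO(line)))
    record = []
    for value, (tag, sub_delim, codes, occ_delim) in zip(fields, spec):
        value = value.strip()
        if not value:
            continue  # assumption: an empty field produces no output
        occurrences = value.split(occ_delim) if occ_delim else [value]
        for occurrence in occurrences:
            occurrence = occurrence.strip()
            if sub_delim:
                parts = [p.strip() for p in occurrence.split(sub_delim)]
                # Pair each part with the next subfield code (a, b, ...);
                # parts beyond the available codes are dropped in this sketch.
                occurrence = "".join("^" + c + p for c, p in zip(codes, parts))
            record.append((tag, occurrence))
    return record

for tag, content in convert(SAMPLE, SPEC):
    print(tag, content)
```

Because the fields are matched by position, the order of the field descriptions in the specification file has to follow the order of the fields in the dBase structure exactly.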
Finally we will need an FST like the following one to import the ISO 2709 file into the test database (which here uses the CCF tags):
 22 0 '^a19980110'
200 0 "^a"v1
300 0 v2/,v3/,v4/
400 0 "^a"v5,"^b"v6
440 0 "^a"v7"0000"
460 0 "^a"v8" p"
620 0 (|^a|v9/)
This will result in this CDS/ISIS record (here displayed with the built-in display format ALL):
22: ^a19980110
200: ^aWindows for Workgroups made easy
300: ^aSheldon^bTom
300: ^aSheldon^bJim
400: ^aOsborne^bMcGraw-Hill
440: ^a19930000
460: ^a550 p
620: ^acomputers
620: ^aoperating systems
620: ^aWindows
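
The effect of the FST lines can also be followed in a small Python sketch. It is not the CDS/ISIS formatting language itself, only an imitation of these seven lines under the assumption that an empty field produces no output; the names apply_fst and intermediate are hypothetical, and the input is an assumed form of the record in the ISO 2709 file.

```python
def apply_fst(rec):
    """rec maps a tag of the ISO file to a list of occurrences."""
    def first(tag):
        values = rec.get(tag, [])
        return values[0] if values else ""

    out = [(22, "^a19980110")]                  # unconditional literal
    if first(1):
        out.append((200, "^a" + first(1)))      # "^a"v1
    for tag in (2, 3, 4):                       # v2/,v3/,v4/ : one line each
        if first(tag):
            out.append((300, first(tag)))
    place = ""
    if first(5):
        place += "^a" + first(5)                # "^a"v5
    if first(6):
        place += "^b" + first(6)                # "^b"v6
    if place:
        out.append((400, place))
    if first(7):
        out.append((440, "^a" + first(7) + "0000"))   # "^a"v7"0000"
    if first(8):
        out.append((460, "^a" + first(8) + " p"))     # "^a"v8" p"
    for keyword in rec.get(9, []):              # (|^a|v9/) : repeatable group
        out.append((620, "^a" + keyword))
    return out

# Assumed content of the record after the conversion (hypothetical).
intermediate = {
    1: ["Windows for Workgroups made easy"],
    2: ["^aSheldon^bTom"],
    3: ["^aSheldon^bJim"],
    5: ["Osborne"],
    6: ["McGraw-Hill"],
    7: ["1993"],
    8: ["550"],
    9: ["computers", "operating systems", "Windows"],
}

for tag, content in apply_fst(intermediate):
    print(f"{tag}: {content}")
```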
Although this can be a good alternative in cases where programs like DBTOISIS and DB3ISO will not work, comma delimited files should be avoided as input material, for several reasons:
  1. The data may contain the characters which are used to delimit the fields: quotation marks and commas, e.g.
    The difference between "Edit record", "Edit last search result" and "Recall last record modified" in CDS/ISIS.
  2. dBase memo fields are omitted in an output to the comma delimited format.
It may be better to make an output of the dBase file by means of the DUMP program (see paragraph 6.6.3.). Afterwards this text file can be converted by Fangorn as any kind of tagged ASCII file (see paragraph 5.2.).
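
The first problem in the list above can be made visible with a short Python test. It is only an illustration with hypothetical data, and it assumes that quotation marks inside a field are written to the comma delimited file without any escaping. Python's csv module stands in here for any program that reads such a file.

```python
import csv
import io

def split_fields(line):
    """Split one comma delimited line into its fields."""
    return next(csv.reader(io.StringIO(line)))

# Two fields (title, city) in both lines; both lines are hypothetical.
clean = '"Windows for Workgroups made easy","Osborne"'
dirty = ('"The difference between "Edit record", "Edit last search result" '
         'and "Recall last record modified" in CDS/ISIS","Osborne"')

print(len(split_fields(clean)), "fields found in the clean line")
print(len(split_fields(dirty)), "fields found in the line with quotation marks")
for field in split_fields(dirty):
    print(repr(field))
```

The second line no longer yields two fields, so every field after the title would be assigned to the wrong tag.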
© Piet de Keyser, 1998
