CCF CONVERTER
Version 1.5

USER MANUAL

Including instructions for creating ISO 2709 records
and for the construction of conversion tables

January 1991


CONTENTS

1 INTRODUCTION
  1.1 Availability
  1.2 Assistance
  1.3 Further documentation
2 FILES AND FILENAMES
  2.1 Files
  2.2 Filenames
3 RUNNING THE PROGRAM
  3.1 Commands
  3.2 Explanation of commands
      Table
      Convert
      Subfield
      Isis
      ISO2709
      Display
      Print
      Verify
      Input
      System
      Help
      Stop
4 FILE CONVERSION
  4.1 Conversion tables
  4.2 Conversion process
  4.3 Conversion in Verify Mode
5 PROCESS CODES
6 SCREEN MESSAGES
7 SOURCE CODE


1 INTRODUCTION

This document is the user manual for CCF Converter, a computer program which converts bibliographic records from one standard format to another. The main purpose of the program is to convert records in UNIMARC format to Unesco's Common Communication Format (CCF) and vice versa. For that reason, a number of special processes for those conversions are included in the program, in addition to processes for converting ISO 2709 records to the CDS/ISIS format. Special processes for other kinds of ISO 2709 conversions can be added as needed.

The computer program, which is ready to use, runs on all MS-DOS (PC-DOS) computers. A UNIX version is also available.

1.1 Availability

The program is available from Unesco without charge. For a free copy write to:

  Division of the General Information Programme
  Unesco
  7 place de Fontenoy
  75700 PARIS
  France

1.2 Assistance

Questions about the operation of the program should be addressed to:

  Prof. Peter Simmons
  School of Library, Archival and Information Studies
  University of British Columbia
  Vancouver, B.C. V6T 1Z1
  Canada

  Internet:  simmons@unixg.ubc.ca
  Envoy100:  PA.Simmons
  Telephone: 604-822-4259
  Fax:       604-822-6465
  Telex:     0451233

1.3 Further documentation

More technical documentation for programmers is found in comments embedded within the program.

2 FILES AND FILENAMES

2.1 Files

The following files are distributed as part of this program package.

  CCF.EXE      is the executable program.
  RESOURCE.DM  is a text file containing screen messages.
  CCF.DOC      is a text file containing this user manual.
  UNICCF.TAB   contains a conversion table to convert records from UNIMARC to Unesco's Common Communication Format.
  CCFUNI.TAB   contains a conversion table to convert records from CCF to IFLA's UNIMARC format.
  UNI.D5       contains six UNIMARC records re-keyed from examples shown in the UNIMARC Manual.
  CCF.D5       contains the six records in UNI.D5 converted to the CCF format.
  CODE         is the name of a subdirectory containing the source code and object modules.

2.2 Filenames

In this release of the program no significance is attached to filenames or filename extensions. The user may choose any filename and any filename extension.

Wherever a filename occurs in this manual it may be preceded by a pathname. Files containing records, conversion tables, and programs need not reside in the same subdirectory.

3 RUNNING THE PROGRAM

Starting the program with the DOS command

  CCF

brings up the display screen consisting of the program greeting and the program prompt. The prompt waits for a command.

All commands and other screen messages shown in this manual are the default English-language messages. As indicated in the section entitled Screen Messages, all messages appearing on the computer monitor may be changed to suit the user.

3.1 Commands

The following commands are possible:

  Table    - loads a conversion table into memory
  Convert  - converts data from one format to another
  SUBfield - determines the displayed subfield code
  Isis     - converts ISO 2709 records to the CDS/ISIS format
  ISO2709  - converts ISIS records to ISO 2709 format
  Display  - displays a file of records on the screen or in a file
  Print    - prints a file containing records or a conversion table
  Verify   - sets or clears verification mode
  Input    - converts records in a text file to the target format
  SYStem   - executes DOS commands while the program is running
  Help     - displays the list of commands
  Stop     - stops execution of the program

Commands may be abbreviated by keying only the capital letters shown in the list above.

3.2 Explanation of commands

The section below gives an explanation of each command and examples of its use.

TABLE

This command loads a conversion table. For example

  TABLE UNICCF.TAB

The conversion table stays loaded until another table is loaded or the program is stopped. Conversion table files are text files which may be created and modified with any editor. Rules for the format of conversion table files appear in the section entitled File Conversion.

CONVERT

This command is used to convert records from one ISO 2709 format to another, or between an ISO 2709 format and CDS/ISIS format. It is not possible to convert from one CDS/ISIS format to another; for example, a CDS/ISIS record with MARC tags cannot be converted to a CDS/ISIS record with CCF tags.

This command can be used only after a conversion table is loaded with the TABLE command. For example:

  TABLE UNICCF.TAB
  CONVERT UNI.D5 CCF.D5

With these commands the UNICCF conversion table is loaded first; then the records in the file UNI.D5 are converted and written to the file CCF.D5.

The input file must have at least one record in it. If it has two or more records, they may be separated by a space or End Of Line (RETURN) character.
There is no limit to the number of records in a file. If the output file does not exist it will be created automatically by the program. If the output file does exist it will be overwritten by the program. When the conversion process ends a report is displayed showing the number of fields and records converted.

Verify mode: When the VERIFY command has not been given, conversion continues until all records in the source file have been converted. Then a report is displayed showing the number of fields and records converted.

When the VERIFY command has been given, each target field is shown at the time it is written, and the conversion pauses when the screen is full to give the user the chance to stop the conversion process. The ENTER key causes the conversion to continue; the N key stops conversion at that point. As described in the section entitled Conversion in Verify Mode, the display that appears as the record is being converted does not always match the record in its fully converted form.

SUBFIELD

This command tells the program which character is to be displayed in subfield identifiers. If this command is not used the program displays all records using the @ sign as the default subfield flag.

In ISO 2709 records the code that is used as a subfield flag (the first of the two characters that form a subfield identifier) has no standard graphic representation assigned to it and therefore appears in various ways in different formats and different countries. In various implementations of ISO 2709 this character may appear as an 'at' sign (@), a dollar sign ($), a caret (^), a vertical bar (|), etc. By custom, CCF records use the @ sign as a subfield flag, while MARC records typically use the $ sign. Some agencies substitute another character for their own convenience.

Use of the SUBFIELD command prior to DISPLAY determines how the subfield flags will be displayed.
For example, the following commands

  SUBFIELD #
  DISPLAY UNI.D5

will cause the records in the file UNI.D5 to appear with the # sign at the beginning of every subfield in the record. The setting remains in effect until the command is used again or the program is stopped.

The SUBFIELD command may also be used before the INPUT command so that agencies creating ISO 2709 records with the CCF Converter may display records with a subfield flag of their own choosing.

This command does not change the content of the record; it affects only the representation of the record in subsequent INPUT, DISPLAY and PRINT commands.

ISIS

This command is used to convert ISO 2709 formatted records to CDS/ISIS records. These are used only by the computer program Mini/micro CDS/ISIS, which is distributed by Unesco. The command is used without reference to a conversion table, since the field designators (tags) and subfield identifiers are not changed in any way. A typical use of the command would be

  ISIS FILEONE FILETWO

where FILEONE represents the name of a file of one or more ISO 2709 records and FILETWO represents the name of the file that is to receive the CDS/ISIS records.

The sole purpose of this command is to create files of records for importation into Mini/micro CDS/ISIS. CDS/ISIS records cannot be displayed by the CCF Converter.

CDS/ISIS records differ from ISO 2709 records in the following ways:

  1. ISIS records use the ^ character as a subfield flag. The ^ character is not merely the displayed character; it forms part of the content of the record.

  2. ISIS records use the # character both as a field terminator and a record terminator. The # character is not merely the displayed character; it forms part of the content of the record.

  3. ISIS records are divided into 80-character segments; return/linefeed characters appear after every 80th character and after the end-of-record mark.

  4. ISIS will only accept for import (and therefore can only export) records whose Implementation Defined Section of the directory, as specified in label character position 22, is 0 (zero).

ISO2709

The conversion of records from CDS/ISIS format to ISO 2709 is also included in the CCF Converter. A typical command would be

  ISO FILEONE FILETWO

Since Mini/micro CDS/ISIS has the ability to output ISO 2709 records, the CCF Converter is not the only way to accomplish this conversion.

Note that it is not possible to convert a CDS/ISIS record into a multi-segment CCF record, since the CDS/ISIS record cannot contain the codes required for segment indicators. Also, any ^ character existing within the record will be converted to a subfield flag.

DISPLAY

This command shows ISO 2709 records in 'tabular' display rather than as they exist in their respective special formats. It can also be used to display conversion tables. It cannot be used to display CDS/ISIS records. Displayed items may be directed to the monitor or to a file.

Display stops after every 22 lines and gives a message asking whether to continue. At this point the user may key N or n for a negative answer, in which case the display will stop. The ENTER key causes the display to continue.

The command

  DISPLAY CCF.D5

displays the contents of the file CCF.D5 on the screen. There are three features of displayed records worth noting:

  1 The contents of the 24-character label appear as a field with the tag 000.

  2 The subfield flags will be displayed as the default '@' unless the SUBFIELD command has been used to change the display.

  3 The format in which displayed records are shown is the format in which records must appear for use with the INPUT command.

The command

  DISPLAY CCF.D5 MYFILE.TXT

sends the display of records held in file CCF.D5 to file MYFILE.TXT, where it is accessible to a text editor.
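
As an illustration of what a display of this kind involves, the following Python sketch reads one ISO 2709 record and lists its fields, reporting the 24-character label under tag 000 and showing the subfield flag as a chosen character. It is a minimal sketch, not the Converter's own code (which is written in C): the function names make_record and parse_iso2709 are hypothetical, the sample record is invented for the illustration, and the exact layout of the Converter's tabular display is not reproduced. The control characters used (hex 1E, 1D and 1F) are the usual ISO 2709 field terminator, record terminator and subfield delimiter.

```python
FIELD_END = "\x1e"      # ISO 2709 field terminator
RECORD_END = "\x1d"     # ISO 2709 record terminator
SUBFIELD_FLAG = "\x1f"  # ISO 2709 subfield delimiter


def make_record(fields):
    """Build a small ISO 2709 record from (tag, text) pairs (hypothetical helper)."""
    directory, data = "", ""
    for tag, text in fields:
        body = text + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start position.
        directory += f"{tag}{len(body):04d}{len(data):05d}"
        data += body
    directory += FIELD_END
    base = 24 + len(directory)              # base address of data
    total = base + len(data) + 1            # +1 for the record terminator
    # Label: length, status, codes, indicator/identifier lengths, base
    # address, user positions, directory map "450" and one reserved blank.
    label = f"{total:05d}nam  22{base:05d}   450 "
    return label + directory + data + RECORD_END


def parse_iso2709(record, flag="@"):
    """Return (tag, text) pairs; the label is reported under tag 000."""
    label = record[:24]
    base = int(label[12:17])                # base address of data
    len_len = int(label[20])                # digits in 'length of field'
    start_len = int(label[21])              # digits in 'starting position'
    impl_len = int(label[22])               # implementation-defined digits
    entry_len = 3 + len_len + start_len + impl_len
    directory = record[24:base - 1]         # drop the directory terminator
    rows = [("000", label)]
    for i in range(0, len(directory), entry_len):
        entry = directory[i:i + entry_len]
        tag = entry[:3]
        length = int(entry[3:3 + len_len])
        start = int(entry[3 + len_len:3 + len_len + start_len])
        text = record[base + start:base + start + length]
        rows.append((tag, text.rstrip(FIELD_END).replace(SUBFIELD_FLAG, flag)))
    return rows


if __name__ == "__main__":
    sample = make_record([("001", "1234"), ("200", "00\x1faThe title")])
    for tag, text in parse_iso2709(sample, flag="@"):
        print(tag, text)
```

Passing a different flag character to parse_iso2709 changes only what is shown, not the record itself, which mirrors the behaviour described for the SUBFIELD command.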

PRINT

The rules for this command are exactly the same as those for the command DISPLAY, except that by default this command sends the table or records to be printed to the device PRN. Typical commands are

  PRINT CCF.REC

to print the records in a file called CCF.REC, or

  PRINT TABLE

to print the currently loaded table.

In certain ways, PRINT works exactly as DISPLAY does. For example, the command

  PRINT CCF.REC MYFILE

sends the display of records held in file CCF.REC to file MYFILE. Full MS-DOS

The name of a device or port may also be added as a parameter. The command

  PRINT CCF.REC COM1

or

  PRINT CCF.REC LPT1

will direct output to the designated port.

VERIFY

This command invokes Verify Mode, which permits the user to monitor events as they occur. Once the command is given, Verify Mode stays in effect until the command

  VERIFY OFF

is given or the program is stopped.

The use of VERIFY before the CONVERT command permits the user to view target data elements as they are created. However, because the conversion of source subfields initially produces target subfields which may not match the final form of the record, VERIFY should be left off or turned off before the CONVERT command is used.

INPUT

This command allows display-format records created with a word processor or text editor to be converted to ISO 2709 form. A typical command would be

  INPUT FILE1.TXT CCF.REC

where FILE1.TXT is the name of a file containing one or more display-format records, and CCF.REC is the name of the destination file.

Within the source file used for input the following rules apply:

  1 Records should appear as they do in the display format produced by the CCF Converter.

  2 Characters which are to appear in the label should be keyed into a 24-character field tagged 000. Where calculated values are required (for the length of record and base address of data) random characters (such as 99999 or xxxxx) or blanks may be keyed.
    These will be replaced with the appropriate calculated values when the record is created.

  3 Where a segment indicator for a CCF record is not provided, the CCF Converter will supply a default value of 0.

  4 Repetition counters for CCF records may be omitted; the CCF Converter will supply these, overriding any values which may have been input.

  5 Any line which has a character in column 1 will be treated as the start of a new field. A line which begins with a blank will be treated as the continuation of the field in the line above.

  6 Every record must end with a blank line, regardless of the number of records in the file.

  7 Records may be created in any format based on ISO 2709. The format of the input record is determined by the table loaded at the time of input.

  8 Any character may be used as a subfield flag provided the SUBFIELD command is used before records are input. Subfield codes may be in upper or lower case as stipulated by the individual format.

SYSTEM

This command allows the user to carry out a DOS command, for example to display a table or a disk directory, or to run another program. For example,

  SYSTEM DIR /P

shows the current directory, and

  SYSTEM TYPE UNI.D5

causes DOS to display the contents of the file UNI.D5. When the action caused by the command has terminated, the user is returned to the CCF Converter.

HELP

The HELP command causes a list of commands and a brief description of each to be displayed on the screen. The information displayed is taken from the RESOURCE.DM file; it can be edited with a text editor to suit the computing environment.

STOP

This command halts execution of the program and returns the user to the operating system.

4 FILE CONVERSION

The conversion of one or more records requires that a conversion table be created and loaded before the CONVERT command is given. This section explains the rules for creating conversion tables that work with the CCF Converter.

4.1 Conversion tables

Within the conversion table file each 'record' (a single line) may contain six variable-length fields, separated by vertical bars. These are:

  source tag | subfield | process code | target tag | subfield | data

The source tag or target tag can be omitted if it is to be the same as the last given. Process codes may be included, but are not required. If the code is missing, it is assumed that there is none. The valid process numbers are listed in the section entitled Process Codes.

If anything appears in the 'data' field, for 'normal' processes it is added to the front of the source data before placing the result into the target.

The source tags do not have to occur in order, but all subfields for a particular source tag must occur together. The vertical bars have no fixed location; they may appear anywhere on the line.

For some special processes, the fields are redefined. See the section entitled Process Codes.

If the source subfield code is '*', then the data part of the record is put into the target field. This is only done once for each source field that is processed.

Any line in the table file that starts with an asterisk ('*') is considered to be a comment and is ignored.

For example:

  001 |   | 14
  *-----------------------------------
  010 | a | 2 | 100 | A
      | b | 2 |     | C
      | d | 2 |     | C
  *------------------------------------
  020 | a | 2 | 111 | B
      | b | 9
  ... etc ...

In this case the first line consists only of a source tag (001) and a process (14).

The second line starts with an asterisk so it is treated as a comment; it is included only for the convenience of the person editing the table.

The third line contains five of the six permitted fields; only the field containing constant data has been omitted. This line specifies that subfield a of source field 010 must be converted to subfield A of target field 100, using process number 2.
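
The table rules given above (six fields separated by vertical bars, tags carried forward when omitted, comment lines beginning with an asterisk) can be sketched in a few lines of Python. This is an illustrative reading of the rules, not the Converter's own table loader: the function name parse_table and the dictionary keys are hypothetical, blank lines are skipped as an assumption, and surrounding spaces are trimmed from every field, which the real program may not do for constant data.

```python
FIELD_NAMES = ("source_tag", "source_subfield", "process",
               "target_tag", "target_subfield", "data")


def parse_table(lines):
    """Return one dict per table line, carrying omitted tags forward."""
    last_source = ""
    last_target = ""
    rows = []
    for line in lines:
        # Lines starting with '*' are comments; blank lines are skipped
        # here as an assumption of this sketch.
        if not line.strip() or line.startswith("*"):
            continue
        # Split on at most five bars so that a bar inside the constant
        # data (sixth field) would be kept as data.
        parts = [part.strip() for part in line.split("|", 5)]
        parts += [""] * (len(FIELD_NAMES) - len(parts))
        row = dict(zip(FIELD_NAMES, parts))
        # An omitted tag means "same as the last given".
        row["source_tag"] = last_source = row["source_tag"] or last_source
        row["target_tag"] = last_target = row["target_tag"] or last_target
        rows.append(row)
    return rows


if __name__ == "__main__":
    table = [
        "001 |   | 14",
        "*-----------------------------------",
        "010 | a | 2 | 100 | A",
        "    | b | 2 |     | C",
        "    | d | 2 |     | C",
        "*------------------------------------",
        "020 | a | 2 | 111 | B",
        "    | b | 9",
    ]
    for row in parse_table(table):
        print(row)
```

Because the bars have no fixed position, the sketch relies only on their order within the line, never on column numbers.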

On the fourth line both the source tag and the target tag are omitted since they have not changed from the previous line.

Each line could be moved to the left and made more compact; the layout shown is only for the convenience of people reading the table. Thus lines one to four could optionally appear as

  001 | | 14
  010 | a | 2 | 100 | A
  | b | 2 | | C

If an attempt to load a table fails, then no table is loaded, and none is available for the CONVERT command or for printing the table.

4.2 Conversion process

The CONVERT command uses the currently loaded table to convert the items in one file and write them to another. The program converts each item in the source file in sequence starting with the first. For each item, it gets all fields in the order they occur in the source record and converts them according to the currently loaded table.

Every source item will produce a target item if it appears in the conversion table. Where a source field does not appear in the conversion table it is thrown away, appearing in the Verify Mode display between multiple asterisks.

4.3 Conversion in Verify Mode

If Verify Mode has been set, conversion will pause between items and the target fields will be displayed as they are being created. However, because the sequence is that of the source record, a target field may appear more than once. For example, when source field 111 creates target field AAA, the target field will be displayed. If source field 999 later results in adding more data to target field AAA, the target field will appear again in its latest form. Thus the Verify Mode display does not always show the final form of the resulting target field.

Also, target fields shown during conversion are not ordered by tag, since sequencing of target fields is done at the end of the conversion process. The actual resulting target record can be seen with the DISPLAY or PRINT commands.

5 PROCESS CODES

In the conversion of records from one format to another, source subfields do not always have a one-to-one relationship to target subfields.
Therefore the program includes a number of special processes which may be invoked as needed. This is done by specifying the number of the process, the 'process code', in the appropriate place in the conversion table.

Each process is designed to accomplish a task for a specific kind of conversion. Many, but not all, of the processes currently provided are used only for converting UNIMARC records to the CCF format. The following list includes some codes which accomplish the same thing as others. A few codes at the start of the table have been re-stated later in the table to provide a better explanation. All codes are available for the user to build a new conversion table or edit an existing one.

The process codes are:

  1  If the target tag changes and the new target field already exists, throw away this subfield and all of the rest of the source field.

  2  If the target tag changes, start a new field. Always start a new subfield.

  3  If the target tag changes, start a new field. Add this subfield to the end of an existing subfield (if any).

  4  Convert embedded UNIMARC fields to CCF secondary segments. This is an alternative to process Code 6, which creates linked, single-segment records.

     The first target subfield in the table is the place to put the 001 data if any. The rest of the lines in the table for this source field give the other fields that are to be generated and the constant data to be placed within them.

     Once the constant fields have been generated, the rest of the data is parsed on the $1 subfield and each embedded field is processed as if it was a direct field from the source item. All of the resulting fields are treated as they are elsewhere in the table, except that they go to a segment whose number is one greater than any other created for this item.

     For example, when converting a UNIMARC record to a segmented CCF record, the data:

       410 00$10011234$120000$aThe title.

     and the table entry:

       410 | | 4 | 010 | A
           | |   | 015 |   | @A02
           | |   | 080 |   | @A02@B0
           | |   | 510 |   | @Aseries of target item

     produces:

       01020 00@A1234
       01520 00@A02
       08020 00@A02@B0
       51020 00@ASeries of target item
       20020 00@AThe title.

  5  If the target tag changes, then add the constant data to the end of any existing field with the same tag. If the subfield already exists, then throw the source subfield away.

  6  Convert embedded UNIMARC fields to single-segment, linked CCF records using the 088 field developed for CCF/B. This is an alternative to Code 4, which creates multi-segment CCF records.

     In all target records, the segment identifier will be set to zero.

     The program considers the primary target record to be the "parent"; other records created by this process become "children". The first child record will have as its identifier the identifier of the parent record with -1 added, and so on for each additional child record.

     The bibliographic level in the label (character position 7) will be set at 'u' (for 'unknown') for all child records. CCF users should be warned that this process can create records which lack mandatory CCF fields.

     Every record will have an 088 field. These will be created in matched pairs, pointing to each other from the parent record to and from each child record. The codes contained in subfield B of these 088 fields, which are determined in the conversion table according to the tag of the source record, will be reciprocal, as follows.

       IF the parent field 088,     THEN the child field 088,
       subfield B contains          subfield B contains

               01                           02
               02                           01
               11                           12
               12                           11
               13                           13
               21                           22
               22                           21
               25                           26
               26                           25
               31                           32
               32                           31
               33                           33
               24                           34
               35                           02

     Any data tagged 001 in the source field will be converted to field 011 in the target record. All other embedded fields will be converted as determined elsewhere in the table.

  7  Always cause a new target field to be created. This is used for a repeating source subfield that generates a repeating target field.

  9  Throw this data away.
     If this code is given in the table for the first line of a source field, then the whole field is ignored.

  10 Convert the 24-character label of the item. Note that it appears to be a field with the tag "000".

  11 Modify formalized dates from the UNIMARC 005 field.

  12 Turn the UNIMARC 100 field into a number of data elements. Positions 38, 39 are saved and are generated as a @S subfield of the first CCF 200 field.

  13 Turn data from the UNIMARC 105 field into CCF type of material codes (field 060).

  14 Pass the source field directly to the target with no conversion. This is used for the 001 field.

  15 Convert the UNIMARC 110 field. This takes the following table entries and uses them to convert CP1. CP2, CP3 and CP7 are used to conditionally produce CCF 060.

  16 A routine to convert from the UNIMARC 122 field.

  17 Test some data of the field and if the condition is true, continue processing the table. If it is not true, then it is as if the field did not match the source field code and the program will search the table for another entry for this field.

  18 For converting from the UNIMARC 604$1 subfield. This converts all embedded fields to CCF 620 and all their subfields to @A.

  19 Process the CCF 030 field for conversion to fixed character positions in the UNIMARC 100 field. The process converts the data as shown in the conversion table below, and moves the resulting data into fixed character positions in the target field 100, as follows:

       Source    Target
       @A        code 9
       @B        cp 26-27
       @C        cp 28-28
       @D        cp 30-31
       @E        cp 32-33
       @F        code 9
       @G        code 9

     Any character position in the target field 100 which does not receive data from a source field will be filled with zeroes. The values in these subfields are converted as follows:

       Source    Target
       2         01
       37        02
       53        03
       54        04
       55        05
       1 or 67   code 9 (since ISO 6630 is the UNIMARC default)

  21 Always start a new target field, and therefore a new subfield. This is the same as Code 2.

  22 If the target subfield exists, start a new field.
     Otherwise add the source subfield to the existing target field. This is the same as Code 7.

  23 If the target subfield exists, append the source data to the existing target data. This is the same as Code 3.

  24 If the target subfield exists, start a new target subfield within the existing field.

  25 If the target subfield exists, throw the source data away.

  26 Throw this source subfield away and all the remainder of the source field. This is the same as Code 9.

  27 If the target field exists, throw the source subfield away.

  28 If the target field exists, throw this source subfield away and all the remainder of the source field.

  30 This is a special process for converting the CCF label to UNIMARC. If source cp=7 the entire record is discarded.

  31 This moves the second indicator of the CCF 021 field into cp17 of the UNIMARC label.

  32 This builds UNIMARC field 100.

  33 The content of CCF field 023 is moved into UNIMARC field 005.

  34 The contents of CCF field 060@A are moved to UNIMARC label cp6.

  35 This complex process converts CCF linked fields to UNIMARC linked fields. See the notes in CCFUNI.TAB for an explanation.

  36 Indicator 1 in CCF field 240 is converted to UNIMARC field 500 indicators 1 and 2, as shown in CCFUNI.TAB.

6 SCREEN MESSAGES

All of the messages which appear on the screen during operation of the program appear by default in English. If a file named RESOURCE.DM appears in the same directory as CCF.EXE, the program examines that file to determine the text of each message. Therefore changing the contents of RESOURCE.DM will result in changing the messages which appear on the screen.

The format of each line in the file is

  A +DAT B

where A is the name of the routine (program segment) that uses the command or produces the message, +DAT is a fixed element with a single space before and after it, and B is the text of the command or message. Neither the name of the routine (part A) nor the +DAT element should be changed.
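
The line format just described can be sketched in Python as a split on the fixed +DAT element, with only part B replaced when a message is reworded. This is a minimal illustration, not part of the program: the function names parse_resource_line and replace_text are hypothetical, and the sample wording in the usage lines is invented.

```python
SEPARATOR = " +DAT "   # fixed element, with one space before and after


def parse_resource_line(line):
    """Split one RESOURCE.DM line into (routine name, message text)."""
    routine, found, text = line.rstrip("\r\n").partition(SEPARATOR)
    if not found:
        raise ValueError("line has no ' +DAT ' element: " + repr(line))
    return routine, text


def replace_text(line, new_text):
    """Return the line with part B replaced; part A and +DAT are kept."""
    routine, _old_text = parse_resource_line(line)
    return routine + SEPARATOR + new_text


if __name__ == "__main__":
    original = "COM.COM.STOP +DAT Stop - terminates the program"
    print(parse_resource_line(original))
    print(replace_text(original, "Stop - halts the program"))
```

Keeping part A and the separator untouched in replace_text reflects the rule that only the wording in part B should be edited.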
A message may contain a percent sign (%); this represents a variable and should not be changed. The remainder of part B is the wording of commands and messages. These may be changed with any text editor or word processor.

Command names appear in the file with a dash (-) and a message. Removing the message from a command line prevents the command from appearing on the screen, although the command will still work. For example, the command QUIT will stop execution of the program because it appears as a command in the RESOURCE.DM file, but it never appears in lists of commands because it has no message.

The command QUIT as shown in RESOURCE.DM is also an example of how a command or message may be provided in more than one form. By default the program is stopped with the command STOP. The user can provide the ability to also stop the program with QUIT, LOGOFF and END. This is done by duplicating and editing the line

  COM.COM.STOP +DAT Stop - terminates the program

so that there are four lines:

  COM.COM.STOP +DAT Stop - stops the program
  COM.COM.STOP +DAT Quit
  COM.COM.STOP +DAT Logoff - stops the program
  COM.COM.STOP +DAT End - terminates the program

Letters of the command appearing in capitals may be input as an abbreviation. In these examples the first letter serves as an abbreviation, since there are no other commands beginning with Q, L or E. The letter S may be used as an abbreviation for STOP because SYS is used for the command SYSTEM and SUB for SUBFIELD.

Before editing this file, be sure to make a safe copy of the original version.

7 SOURCE CODE

The CCF Converter originated as a number of routines which form part of a large, integrated automation system developed at the University of British Columbia in Vancouver, B.C., Canada. For this reason programmers who examine the source code may find references to functions not implemented in this program.

The source files used to create the program reside in the subdirectory CODE. They were compiled with the Turbo C compiler.