Mapping MARC data to arbitrary formats has become a common task. Such mappings are usually based on a set of rules defining which data of a MARC record should be accessed. These rules are commonly called a MARC field specification.
Several tools already exist, and each uses its own, and therefore different, MARC field specification. The specification described here, MARCspec, is an approach to normalizing such field specifications so that they become uniform and interchangeable.
Commit history:
2017-11-20 09:29:51 +0100: errata and section changes
2017-11-16 12:23:19 +0100: altered status of this document
2017-11-16 11:39:09 +0100: dropped indicators in favour of indicatorSpec
2016-06-14 11:13:36 +0200: at least one subfieldSpec for variable field
2015-11-30 08:25:21 +0100: subfieldTag is now subfieldCode
The current version of this specification has beta status. It is already implemented in various tools for different programming languages. All implementers and interested parties are welcome to give feedback!
The keywords ‘MUST’, ‘MUST NOT’, ‘REQUIRED’, ‘SHALL’, ‘SHALL NOT’, ‘SHOULD’, ‘SHOULD NOT’, ‘RECOMMENDED’, ‘MAY’, and ‘OPTIONAL’ in this document are to be interpreted as described in RFC 2119.
See also Definition of MARC related terms used in this spec.
Machine-Readable Cataloguing (MARC) is a document based exchange format for bibliographic and other library related data. A MARC record consists of three main sections: the leader, the directory, and the variable fields with the data content.
There are two kinds of (variable) fields: (variable) control fields and (variable) data fields. The term fixed field stands for fields whose length does not vary, like the leader and some control fields. The field content in the fixed fields can be accessed through its character position or range. Only data fields are divided into subfields. Subfields can also be contextualized through indicators. There is an indicator 1 and an indicator 2 for all data fields; both are optional.
A MARCspec is a reference to field data of a MARC record and is very much like XPath for XML. With MARCspec one can reference data on different levels of a MARC record defined through the fields, character positions, subfields and indicators.
The data of the MARC record being referenced may be represented through a set of data, having zero or more data elements. MARCspec defines neither the form of this referenced set of data nor the encoding of the referenced data content.
A MARCspec might not fulfil all requirements for a reference to the desired set of data in the way XPath does for XML. This is because of the nearly unlimited number of options for accessing data in a MARC record, especially when it comes to delimiters based on cataloging rules. Thus a MARCspec has to concentrate on the basic references and leave all other data processing to subsequent functions of the tools that implement MARCspec.
To enable support for other ISO 2709 applications, the MARCspec syntax does not distinguish between types of fields as the MARC record structure does. A valid MARCspec might violate the MARC record structure. It is left to the MARCspec-aware tools whether or not to check for violations of the MARC record structure.
A MARCspec allows the following basic references:
References a MARCspec does not allow are:
This section is normative.
The primary form of a MARCspec is a string.
The Augmented BNF for Syntax Specifications: ABNF RFC 5234 is used to define the form of the MARCspec as string.
The complete ABNF for MARCspec is as follows:
alphaupper = %x41-5A
; A-Z
alphalower = %x61-7A
; a-z
DIGIT = %x30-39
; 0-9
VCHAR = %x21-7E
; visible (printing) characters
positiveDigit = %x31-39
; "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
positiveInteger = "0" / positiveDigit [1*DIGIT]
fieldTag = 3(alphalower / DIGIT / ".") / 3(alphaupper / DIGIT / ".")
position = positiveInteger / "#"
range = position "-" position
positionOrRange = range / position
characterSpec = "/" positionOrRange
index = "[" positionOrRange "]"
fieldSpec = fieldTag [index] [characterSpec]
abrFieldSpec = index [characterSpec] / characterSpec
subfieldChar = %x21-3F / %x5B-7B / %x7D-7E
; ! " # $ % & ' ( ) * + , - . / 0-9 : ; < = > ? [ \ ] ^ _ ` a-z { } ~
subfieldCode = "$" subfieldChar
subfieldCodeRange = "$" ( (alphalower "-" alphalower) / (DIGIT "-" DIGIT) )
; [a-z]-[a-z] / [0-9]-[0-9]
abrSubfieldSpec = (subfieldCode / subfieldCodeRange) [index] [characterSpec]
subfieldSpec = fieldTag [index] abrSubfieldSpec
abrIndicatorSpec = [index] "^" ("1" / "2")
indicatorSpec = fieldTag abrIndicatorSpec
comparisonString = "\" *VCHAR
operator = "=" / "!=" / "~" / "!~" / "!" / "?"
; equal / unequal / includes / not includes / not exists / exists
abbreviation = abrFieldSpec / abrSubfieldSpec / abrIndicatorSpec
subTerm = fieldSpec / subfieldSpec / indicatorSpec / comparisonString / abbreviation
subTermSet = [ [subTerm] operator ] subTerm
subSpec = "{" subTermSet *( "|" subTermSet ) "}"
MARCspec = fieldSpec *subSpec / (subfieldSpec *subSpec *(abrSubfieldSpec *subSpec)) / indicatorSpec *subSpec
Every MARCspec is either a spec for field data, subfield data or indicator values. All specs can be contextualized through subSpecs (see section SubSpecs).
MARCspec = fieldSpec *subSpec / (subfieldSpec *subSpec *(abrSubfieldSpec *subSpec)) / indicatorSpec *subSpec
A fieldSpec is a reference to field data of a field. It consists of the three-character field tag, optionally followed by an index and a characterSpec.
The field tag may consist of ASCII numeric characters (decimal integers 0-9) and/or ASCII alphabetic characters (uppercase or lowercase, but not both) or the character ‘.’. The character ‘.’ is interpreted as a wildcard. E.g. ‘3..’ is then a reference to the data elements in all fields beginning with ‘3’.
The special field tag ‘LDR’ is the field tag for the leader.
alphaupper = %x41-5A ; A-Z
alphalower = %x61-7A; a-z
DIGIT = %x30-39; 0-9
fieldTag = 3(alphalower / DIGIT / ".") / 3(alphaupper / DIGIT / ".")
fieldSpec = fieldTag [index] [characterSpec]
If a field is referenced without an explicitly given index, the starting index ‘0’ and the ending index ‘#’ are implied (see Reference to occurrence examples).
Reference to field data of the leader.
LDR
Reference to all field data of fields having a field tag starting with 00.
00.
Reference to all field data of fields having a field tag starting with 7.
7..
Reference to data elements of all repetitions of the ‘100’ field.
100
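The wildcard behaviour of the field tag can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name `tag_matches` and the sample tag list are assumptions made for the example, and the validation pattern is derived from the fieldTag rule above.

```python
import re

# Pattern derived from the fieldTag rule: three characters, each a digit,
# a "." or a letter, where letters are either all lowercase or all uppercase.
FIELD_TAG = re.compile(r"(?:[a-z0-9.]{3}|[A-Z0-9.]{3})\Z")


def tag_matches(spec_tag: str, record_tag: str) -> bool:
    """Return True if record_tag is selected by spec_tag ('.' is a wildcard)."""
    if not FIELD_TAG.match(spec_tag):
        raise ValueError(f"invalid field tag: {spec_tag!r}")
    # Escape every literal character; each "." stands for any one character.
    pattern = "".join("." if ch == "." else re.escape(ch) for ch in spec_tag)
    return re.fullmatch(pattern, record_tag) is not None


# Hypothetical list of tags found in a record.
tags = ["LDR", "001", "007", "100", "245", "700", "710"]
print([t for t in tags if tag_matches("7..", t)])  # ['700', '710']
print([t for t in tags if tag_matches("00.", t)])  # ['001', '007']
```

A mixed-case tag such as ‘Ldr’ is rejected by this sketch, since the fieldTag rule allows uppercase or lowercase letters but not both.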
A characterSpec is a reference to a character or a range of characters within a field or subfield. It consists of a position or range prefixed with the character ‘/’.
characterSpec = "/" positionOrRange
A positionOrRange is either a position or a range.
The position is either a positive integer (counting starts at ‘0’) or the character ‘#’ as a symbol for the last character of the referenced data content.
The range consists of two positions concatenated with the character ‘-’.
positiveDigit = %x31-39
; "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
positiveInteger = "0" / positiveDigit [1*DIGIT]
position = positiveInteger / "#"
range = position "-" position
positionOrRange = range / position
The character ‘#’ is interpreted as a symbol for the last character of the referenced data content (see MARCspec interpretation for implicit rules).
Reference to substring of field data in the leader from character position ‘0’ to character position ‘4’ (5 characters).
LDR/0-4
Reference to data in the leader at character position ‘6’ (1 character).
LDR/6
Reference to data in the control field ‘007’ at character position ‘0’ (1 character).
007/0
Reference to all data but the first character in the control field ‘007’.
007/1-#
Reference to the last character in the control field ‘007’.
007/#
Reference to the last two characters of the value of the subfield ‘a’ of field ‘245’.
245$a/#-1
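How a positionOrRange selects characters can be illustrated with a small Python sketch. The function name `substring` and the sample string are assumptions for this example. The handling of a range that starts with ‘#’ follows the backwards interpretation described in the section MARCspec interpretation; treating ‘#-#’ as the last character is a choice made for this sketch.

```python
def substring(data: str, spec: str) -> str:
    """Resolve a positionOrRange such as '0-4', '6', '#', '1-#' or '#-1'."""
    start, sep, end = spec.partition("-")
    if not sep:
        # A single position is treated as a range of length one.
        end = start
    last = len(data) - 1
    if start == "#":
        # Backwards interpretation: the end position counts from the last
        # character, so '#-1' selects the last two characters.
        back = 0 if end == "#" else int(end)
        lo, hi = max(last - back, 0), last
    else:
        lo = int(start)
        hi = last if end == "#" else int(end)
    return data[lo:hi + 1]


sample = "abcdef"  # hypothetical field content
print(substring(sample, "0-4"))  # abcde
print(substring(sample, "2"))    # c
print(substring(sample, "1-#"))  # bcdef
print(substring(sample, "#"))    # f
print(substring(sample, "#-1"))  # ef
```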
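The subfieldSpec is a reference to the data content (value(s)) of (a) subfield(s) of a variable field. It consists of a field tag, optionally followed by an index, and a subfieldCode or subfieldCodeRange, which in turn can optionally be followed by an index and a characterSpec.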
abrSubfieldSpec = (subfieldCode / subfieldCodeRange) [index] [characterSpec]
subfieldSpec = fieldTag [index] abrSubfieldSpec
A subfieldCode is a subfieldChar prefixed by the character ‘$’.
A subfieldCodeRange is prefixed by the character ‘$’ and restricted to either two alphabetic or two numeric characters, concatenated with the character ‘-’.
A subfieldChar is a lowercase alphabetic, a numeric character or a special character.
subfieldChar = %x21-3F / %x5B-7B / %x7D-7E
subfieldCode = "$" subfieldChar
subfieldCodeRange = "$" ( (%x61-7A "-" %x61-7A) / (%x30-39 "-" %x30-39) )
Reference to value of the subfield ‘a’ of field ‘245’.
245$a
Reference to the value of the subfields ‘a’, ‘b’ and ‘c’ of field ‘245’.
245$a$b$c
Same as above, but with the use of a subfield code range.
245$a-c
Reference to values of subfields ‘_’ and ‘$’.
...$_$$
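The equivalence of ‘245$a$b$c’ and ‘245$a-c’ can be made concrete with a short Python sketch that expands a sequence of subfield codes and code ranges into a flat list. The name `expand_subfield_codes` is hypothetical, and the sketch assumes that a range is written in ascending order and that no index or characterSpec is attached to the codes.

```python
import re

# One token: "$" followed by a lowercase range, a digit range, or a single
# subfieldChar (%x21-3F / %x5B-7B / %x7D-7E).
TOKEN = re.compile(
    r"\$(?:([a-z])-([a-z])|([0-9])-([0-9])|([\x21-\x3f\x5b-\x7b\x7d-\x7e]))"
)


def expand_subfield_codes(codes: str) -> list[str]:
    """Expand e.g. '$a-c' or '$a$b$c' into a flat list of subfield codes."""
    result = []
    pos = 0
    while pos < len(codes):
        m = TOKEN.match(codes, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}: {codes[pos:]!r}")
        if m.group(5) is not None:
            # A single subfieldCode.
            result.append(m.group(5))
        else:
            # A subfieldCodeRange: enumerate every code from first to last.
            first = m.group(1) or m.group(3)
            last = m.group(2) or m.group(4)
            result.extend(chr(c) for c in range(ord(first), ord(last) + 1))
        pos = m.end()
    return result


print(expand_subfield_codes("$a-c"))    # ['a', 'b', 'c']
print(expand_subfield_codes("$a$b$c"))  # ['a', 'b', 'c']
print(expand_subfield_codes("$_$$"))    # ['_', '$']
```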
For repeatable fields and subfields each occurrence can be referenced by its index. An index is a position or range enclosed in the characters ‘[’ and ‘]’. The first occurrence of a field or a subfield is always referenced with the index ‘[0]’. The last occurrence of a field or a subfield is referenced with the index ‘[#]’.
index = "[" positionOrRange "]"
Reference to the first ‘300’ field.
300[0]
Reference to the second ‘300’ field.
300[1]
Reference to the first, second and third ‘300’ fields.
300[0-2]
Reference to all but the first ‘300’ field.
300[1-#]
Reference to the last ‘300’ field.
300[#]
Reference to the last two ‘300’ fields.
300[#-1]
Reference to value of the subfield ‘a’ of the first ‘300’ field.
300[0]$a
Reference to the value of the first subfield ‘a’ of the field ‘300’
300$a[0]
Reference to the value of the last subfield ‘a’ of the field ‘300’
300$a[#]
Reference to the value of the last two repetitions of subfield ‘a’ of the field ‘300’
300$a[#-1]
An indicatorSpec is a reference to the value of either indicator 1 or indicator 2 of a variable field. It consists of a field tag, followed by an optional index, the character ‘^’ and either ‘1’ or ‘2’.
abrIndicatorSpec = [index] "^" ("1" / "2")
indicatorSpec = fieldTag abrIndicatorSpec
Reference to value(s) of indicator 1 of all occurrences of field ‘880’.
880^1
Reference to value of indicator 2 of the second occurrence of field ‘880’ (index ‘[1]’).
880[1]^2
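As an illustration of the indicatorSpec grammar, the following Python sketch turns the two ABNF rules into one regular expression and splits a spec into its parts. The names `INDICATOR_SPEC` and `parse_indicator_spec` are assumptions for this example only.

```python
import re

# position = positiveInteger / "#", with positiveInteger as defined above.
POSITION = r"(?:0|[1-9][0-9]*|#)"

INDICATOR_SPEC = re.compile(
    r"(?P<tag>[a-z0-9.]{3}|[A-Z0-9.]{3})"                # fieldTag
    rf"(?:\[(?P<index>{POSITION}(?:-{POSITION})?)\])?"   # optional index
    r"\^(?P<indicator>[12])"                             # "^" ("1" / "2")
)


def parse_indicator_spec(spec: str) -> dict:
    """Split an indicatorSpec into field tag, index (or None) and indicator."""
    m = INDICATOR_SPEC.fullmatch(spec)
    if m is None:
        raise ValueError(f"not an indicatorSpec: {spec!r}")
    return m.groupdict()


print(parse_indicator_spec("880^1"))
# {'tag': '880', 'index': None, 'indicator': '1'}
print(parse_indicator_spec("880[1]^2"))
# {'tag': '880', 'index': '1', 'indicator': '2'}
```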
With a subSpec the preceding fieldSpec, subfieldSpec or indicatorSpec gets contextualized. Every subSpec MUST be validated as either true or false. If a subSpec is true, the corresponding spec (the last spec outside of the subSpec) is used to reference data. If a subSpec is false, the corresponding spec does not reference data.
A subSpec is enclosed in the characters ‘{’ and ‘}’. A subSpec consists of one or more sets of subTerms (the left hand subTerm and the right hand subTerm) and an operator. This combination of subTerms and operator can be chained through the character ‘|’ (OR) within a subSpec. Multiple subSpecs can also be repeated one after another (AND).
subTerm = fieldSpec / subfieldSpec / indicatorSpec / comparisonString / abbreviation
subTermSet = [ [subTerm] operator ] subTerm
subSpec = "{" subTermSet *( "|" subTermSet ) "}"
The operator is one of
‘=’ (as a symbol for ‘equal’),
‘!=’ (as a symbol for ‘unequal’),
‘~’ (as a symbol for ‘includes’),
‘!~’ (as a symbol for ‘not includes’),
‘!’ (as a symbol for ‘not exists’) or
‘?’ (as a symbol for ‘exists’).
operator = "=" / "!=" / "~" / "!~" / "!" / "?"
A subTerm is one of fieldSpec, subfieldSpec, indicatorSpec, comparisonString or an abbreviation.
If the left hand subTerm is omitted, the corresponding spec outside the subSpec implicitly becomes the left hand subTerm (see MARCspec interpretation for implicit rules). For subSpecs with an omitted left hand subTerm the operator can also be omitted. Omitting the operator implies the usage of the operator ‘?’ (exists).
Checking existence of fields
Reference data content of subfield ‘c’ of field ‘020’, if subfield ‘a’ of field ‘020’ exists.
020$c{?020$a}
same as
020$c{020$c?020$a}
Checking (non) existence of fields
Reference data content of subfield ‘z’ of field ‘020’, if subfield ‘a’ of field ‘020’ does not exist.
020$z{!020$a}
same as
020$z{020$z!020$a}
A comparisonString can be any combination of ASCII characters prefixed by the character ‘\’. To avoid ambiguity, the following characters in a comparisonString MUST be escaped by the character ‘\’: ‘$’, ‘{’, ‘}’, ‘!’, ‘=’, ‘~’, ‘?’ and ‘|’.
In a comparisonString a whitespace MUST be encoded as the character combination ‘\s’.
comparisonString = "\" *VCHAR
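The escaping rules can be sketched as a pair of Python helpers that convert between plain text and a comparisonString. Both function names are hypothetical. The sketch only treats the space character as whitespace and does not address how a literal backslash in the text should be written, since the rules above do not say.

```python
import re

# Characters that MUST be escaped inside a comparisonString.
SPECIAL = "${}!=~?|"


def encode_comparison_string(text: str) -> str:
    """Turn plain text into a comparisonString (including the leading backslash)."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\\s")        # whitespace is written as \s
        elif ch in SPECIAL:
            out.append("\\" + ch)    # reserved characters get a backslash
        else:
            out.append(ch)
    return "\\" + "".join(out)


def decode_comparison_string(spec: str) -> str:
    """Inverse of encode_comparison_string."""
    if not spec.startswith("\\"):
        raise ValueError("a comparisonString starts with a backslash")
    body = spec[1:]
    # Replace \s by a space and drop the backslash before reserved characters.
    return re.sub(
        r"\\(s|[${}!=~?|])",
        lambda m: " " if m.group(1) == "s" else m.group(1),
        body,
    )


print(encode_comparison_string("Random House"))  # \Random\sHouse
print(encode_comparison_string("$4.95"))         # \\$4.95
print(decode_comparison_string(r"\paperback"))   # paperback
```

The explicit ‘\s’ encoding is needed because the comparisonString rule is built on VCHAR, which does not include the space character.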
Checking dependencies via string comparison
If Leader/06 = t: Books
Reference to character with position ‘18’ of field ‘008’, if character with position ‘06’ in Leader equals ‘t’.
008/18{LDR/6=\t}
Checking dependencies via string comparison alternatives
If Field 007/00 = a or t
Reference to subfield ‘b’ of field ‘245’, if character with position ‘0’ of field 007 equals ‘a’ OR ‘t’.
245$b{007/0=\a|007/0=\t}
Checking dependencies via string comparison chains
If Leader/06 = a and Leader/07 = a, c, d, or m: Books
Reference to character with position ‘18’ of field ‘008’, if character with position ‘06’ in Leader equals ‘a’ AND character with position ‘07’ in Leader equals ‘a’, ‘c’, ‘d’ OR ‘m’.
008/18{LDR/6=\a}{LDR/7=\a|LDR/7=\c|LDR/7=\d|LDR/7=\m}
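The AND/OR structure of chained subSpecs can be sketched in Python as follows. The function names and the `facts` set are assumptions for this illustration, and the check of a single subTermSet is passed in as a callback rather than implemented. The splitting treats a delimiter as escaped whenever a backslash directly precedes it, so an empty comparisonString directly before ‘|’ or ‘}’ is not handled.

```python
import re
from typing import Callable


def split_subspecs(chain: str) -> list[list[str]]:
    """Split '{a|b}{c}' into [['a', 'b'], ['c']], skipping escaped delimiters."""
    groups = []
    # A subSpec ends at the first "}" that is not preceded by a backslash.
    for body in re.findall(r"\{(.*?)(?<!\\)\}", chain):
        # Inside a subSpec, an unescaped "|" separates the alternatives.
        groups.append(re.split(r"(?<!\\)\|", body))
    return groups


def chain_is_true(chain: str, term_is_true: Callable[[str], bool]) -> bool:
    """AND over consecutive subSpecs, OR over the alternatives inside each one."""
    return all(
        any(term_is_true(term) for term in group)
        for group in split_subspecs(chain)
    )


chain = r"{LDR/6=\a}{LDR/7=\a|LDR/7=\c|LDR/7=\d|LDR/7=\m}"
print(split_subspecs(chain))

# Hypothetical record with Leader/06 = 'a' and Leader/07 = 'm': these two
# comparisons hold, every other comparison is treated as false.
facts = {r"LDR/6=\a", r"LDR/7=\m"}
print(chain_is_true(chain, facts.__contains__))  # True
```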
Checking dependencies via string comparison and content comparison
Example data:
100 1#$6880-01$aZilbershtain, Yitshak ben David Yosef.
880 1#$6100-01/(2/r$a, יצחק יוסף בן דוד.
Reference data content of subfield ‘a’ of field ‘880’, if data content of subfield ‘6’ of field ‘100’ includes the string ‘-01’ (characters with index range 3-5 of subfield ‘6’ of field ‘880’) and the string ‘880’.
880$a{100$6~$6/3-5}{100$6~\880}
When used as a subTerm, fieldSpec, subfieldSpec and indicatorSpec can be abbreviated.
abbreviation = abrFieldSpec / abrSubfieldSpec / abrIndicatorSpec
See also SubSpec abbreviation rules.
An abbreviated fieldSpec is either an index, optionally followed by a characterSpec, or a characterSpec alone.
abrFieldSpec = index [characterSpec] / characterSpec
Reference to the character at position ‘3’ of the second field ‘007’, if the first character of the second field ‘007’ equals ‘v’.
007[1]/3{/0=\v}
same as
007[1]/3{007[1]/0=\v}
An abbreviated subfieldSpec is a subfieldCode or a subfieldCodeRange, optionally followed by an index and a characterSpec.
abrSubfieldSpec = (subfieldCode / subfieldCodeRange) [index] [characterSpec]
Reference to data content of subfield ‘c’ of field 020, if subfield ‘a’ of field ‘020’ exists.
020$c{$a}
same as
020$c{020$a}
Reference to data content of subfield ‘a’ of field ‘245’, if the last character of the preceding spec equals the comparisonString ‘/’.
245$a{/#=\/}
same as
245$a{245$a/#=\/}
An abbreviated indicatorSpec is an optional index, followed by the character ‘^’ and either ‘1’ or ‘2’.
abrIndicatorSpec = [index] "^" ("1" / "2")
Reference to data of the first field ‘800’, having ‘1’ as value for indicator 2 and data content of subfield ‘a’ includes the comparisonString ‘Poe’.
800[0]{$a~\Poe}{^2=\1}
same as
800[0]{800[0]$a~\Poe}{800[0]^2=\1}
Using an abbreviated subfieldSpec without subfieldCode or subfieldCodeRange makes the subfieldCode or the subfieldCodeRange of the corresponding subfieldSpec implicit.
245$a{/0-2=\The}
same as
245$a{245$a/0-2=\The}
According to MARCspec interpretation a MARCspec without an explicitly given index is always an abbreviation of n references. The following examples show how these specs are interpreted.
Example Data:
020 ##$a0394170660$qRandom House$c$4.95
020 ##$a0491001304
Reference to data content of subfield ‘q’ of field ‘020’ if subfield ‘c’ exists.
020$q{$c}
same as
020[0-#]$q[0-#]{$c[0-#]}
same as
020[0]$q[0]{?020[0]$c[0]} OR // true
020[1]$q[0]{?020[1]$c[0]} // false
Example Data:
020 ##$a0394170660$qRandom House$qpaperback$c$4.95
020 ##$a0394502884$qRandom House$qhardcover$c$12.50
Reference to data content of subfield ‘c’ if data content of one repetition of subfield ‘q’ equals the comparison string ‘paperback’.
020$c{$q=\paperback}
same as
020[0-#]$c[0-#]{$q[0-#]=\paperback}
same as
020[0]$c[0]{020[0]$q[0]=\paperback} OR // false
020[0]$c[0]{020[0]$q[1]=\paperback} OR // true
020[1]$c[0]{020[1]$q[0]=\paperback} OR // false
020[1]$c[0]{020[1]$q[1]=\paperback} // false
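The expansion above can be mimicked in Python with a simple in-memory representation of the example data. This is a minimal sketch: the list-of-pairs data structure and the function names `values` and `select` are assumptions made for the illustration, and the sketch covers only the one pattern ‘field$target{$condition=\string}’.

```python
# Example data from above: each '020' field is a list of (code, value) pairs.
fields_020 = [
    [("a", "0394170660"), ("q", "Random House"), ("q", "paperback"), ("c", "$4.95")],
    [("a", "0394502884"), ("q", "Random House"), ("q", "hardcover"), ("c", "$12.50")],
]


def values(field, code):
    """All values of the subfields with the given code, in order of occurrence."""
    return [value for c, value in field if c == code]


def select(fields, target, cond_code, expected):
    """Sketch of '020$target{$cond_code=expected}' over all repetitions."""
    result = []
    for field in fields:                                          # 020[0-#]
        # The subSpec is checked per field occurrence: it is true as soon as
        # one repetition of the condition subfield equals the string.
        if any(v == expected for v in values(field, cond_code)):  # $q[0-#]
            result.extend(values(field, target))                  # $c[0-#]
    return result


print(select(fields_020, "c", "q", "paperback"))  # ['$4.95']
```

Only the first ‘020’ field has a subfield ‘q’ equal to ‘paperback’, so only its subfield ‘c’ is referenced, which matches the single true line in the expansion.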
This section is normative.
Because of the limited expressivity of the MARCspec, there must be some kind of implicit interpretation.
For each field and subfield without an explicitly given index, the starting index is ‘0’ and the ending index is ‘#’.
The character ‘#’ is always a reference to the last character in the data content.
If the character ‘#’ is used for the character starting position, the character indices MUST be interpreted backwards (like character ending position ‘0’ for the last character, ‘1’ for the last but one character, ‘2’ for the last but two characters etc.).
For subSpecs with an omitted left hand subTerm and the operator ‘?’, the operator can also be omitted.
The following table shows how SubSpec abbreviation MUST be interpreted.
corresponding spec type | corresponding spec end with | abbreviated spec begins with | interpretation | example |
---|---|---|---|---|
fieldSpec | index | index | valid fieldSpec with index | ...[2]{[1]} => ...[2]{...[1]} |
fieldSpec | index | characterSpec | valid fieldSpec with index and characterSpec | ...[1]{/0-3} => ...[1]{...[1]/0-3} |
fieldSpec | index | indicatorSpec | valid indicatorSpec with index | ...[1]{^1} => ...[1]{...[1]^1} |
fieldSpec | characterSpec | index | valid fieldSpec with index | .../0-7{[0]} => .../0-7{...[0]} |
fieldSpec | characterSpec | characterSpec | valid fieldSpec with characterSpec | .../0-7{/0=\2} => .../0-7{.../0=\2} |
fieldSpec | characterSpec | indicatorSpec | invalid indicatorSpec since characterSpec denotes a fixedField, which can’t be used with indicators | .../0-7{^1} => invalid |
subfieldSpec | index | index | valid subfieldSpec with index | ...$a[0]{[1]} => ...$a[0]{...$a[1]} |
subfieldSpec | index | characterSpec | valid subfieldSpec with index and characterSpec | ...$a[0]{/0} => ...$a[0]{...$a[0]/0} |
subfieldSpec | characterSpec | index | valid subfieldSpec with index | ...$a/1{[1]} => ...$a/1{...$a[1]} ...$a/1{[1]/1} => ...$a/1{...$a[1]/1} |
subfieldSpec | characterSpec | characterSpec | valid subfieldSpec with characterSpec | ...$a/1{/0} => ...$a/1{...$a/0} |
subfieldSpec | subfieldCode | indicator position | valid indicatorSpec | ...$a{^2=\0} => ...$a{...^2=\0} |
indicatorSpec | indicator position | indicator position | valid indicatorSpec | ...^1{^1!=\_} => ...^1{...^1!=\_} ...^2{^1!=\_} => ...^2{...^1!=\_} |
indicatorSpec | indicator position | index | valid fieldSpec, subfieldSpec or indicatorSpec with index | ...^2{[1]} => ...^2{...[1]} ...^2{[1]$a} => ...^2{...[1]$a} ...^2{[1]^1=\_} => ...^2{...[1]^1=\_} |
indicatorSpec | indicator position | characterSpec | invalid indicatorSpec; characterSpec is not applicable to indicator values | ...^2{/0=\1} => invalid |
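The rows of the table can be read as a small expansion procedure. The following Python sketch is one possible reading of it, not a normative algorithm: the regular expression `SPEC` and the function `expand` are assumptions for this example, the corresponding spec is expected to be valid and free of subSpecs, and the abbreviated subTerm is passed without operator or comparisonString.

```python
import re

POS = r"(?:0|[1-9][0-9]*|#)"
POS_OR_RANGE = rf"{POS}(?:-{POS})?"

# Splits a corresponding spec into tag, field index, subfield code,
# subfield index, indicator and characterSpec.
SPEC = re.compile(
    r"(?P<tag>[a-z0-9.]{3}|[A-Z0-9.]{3})"
    rf"(?P<findex>\[{POS_OR_RANGE}\])?"
    r"(?:"
    r"(?P<code>\$(?:[a-z]-[a-z]|[0-9]-[0-9]|[\x21-\x3f\x5b-\x7b\x7d-\x7e]))"
    rf"(?P<sindex>\[{POS_OR_RANGE}\])?"
    r"|(?P<ind>\^[12])"
    r")?"
    rf"(?P<chars>/{POS_OR_RANGE})?"
)


def expand(corresponding: str, abbreviated: str) -> str:
    """Complete an abbreviated subTerm using the corresponding spec."""
    m = SPEC.fullmatch(corresponding)
    if m is None:
        raise ValueError(f"cannot parse corresponding spec {corresponding!r}")
    tag, code = m["tag"], m["code"]
    findex, sindex = m["findex"] or "", m["sindex"] or ""
    first = abbreviated[0]
    if first == "$":                      # another subfield of the same field
        return tag + findex + abbreviated
    if first == "^":                      # indicator of the same field
        if code is None and m["ind"] is None and m["chars"]:
            raise ValueError("indicators cannot follow a characterSpec")
        return tag + findex + abbreviated
    if first == "[":
        if code is not None:              # index of the subfield
            return tag + findex + code + abbreviated
        return tag + abbreviated          # index of the field
    if first == "/":
        if m["ind"] is not None:
            raise ValueError("characterSpec is not applicable to indicators")
        if code is not None:
            return tag + findex + code + sindex + abbreviated
        return tag + findex + abbreviated
    raise ValueError(f"not an abbreviation: {abbreviated!r}")


print(expand("007[1]/3", "/0"))  # 007[1]/0
print(expand("020$c", "$a"))     # 020$a
print(expand("...$a/1", "[1]"))  # ...$a[1]
print(expand("800[0]", "^2"))    # 800[0]^2
```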
SubSpecs get validated by the following rules:
A subSpec is true, if
with operator ‘=’ one of the referenced values of the left hand subTerm is equal to one of the referenced values of the right hand subTerm.
with operator ‘!=’ none of the referenced values of the left hand subTerm is equal to one of the referenced values of the right hand subTerm.
with operator ‘~’ one of the referenced values of the left hand subTerm includes one of the referenced values of the right hand subTerm.
with operator ‘!~’ none of the referenced values of the left hand subTerm includes one of the referenced values of the right hand subTerm.
with operator ‘?’ the data referenced by the right hand subTerm exists.
with operator ‘!’ the data referenced by the right hand subTerm does not exist.
A subSpec is false, if
with operator ‘=’ none of the referenced values of the left hand subTerm is equal to one of the referenced values of the right hand subTerm.
with operator ‘!=’ one of the referenced values of the left hand subTerm is equal to one of the referenced values of the right hand subTerm.
with operator ‘~’ none of the referenced values of the left hand subTerm includes one of the referenced values of the right hand subTerm.
with operator ‘!~’ one of the referenced values of the left hand subTerm includes one of the referenced values of the right hand subTerm.
with operator ‘?’ the data referenced by the right hand subTerm does not exist.
with operator ‘!’ the data referenced by the right hand subTerm exists.
operator | right is null | left equals right | right is subpart of left | left is subpart of right | other |
---|---|---|---|---|---|
= | false | true | false | false | false |
!= | true | false | true | true | true |
~ | false | true | true | false | false |
!~ | true | false | false | true | true |
? | false | true | true | true | true |
! | true | false | false | false | false |
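The validation rules and the table can be condensed into a small Python function. This is a sketch under the assumption that the referenced values of each subTerm are available as lists of strings and that ‘null’ corresponds to an empty list; the function name `subterm_set_is_true` is hypothetical.

```python
def subterm_set_is_true(operator: str, left: list[str], right: list[str]) -> bool:
    """Validate one subTermSet from the values referenced by both subTerms."""
    if operator == "?":
        return len(right) > 0            # referenced data exists
    if operator == "!":
        return len(right) == 0           # referenced data does not exist
    # "one of the left values equals / includes one of the right values"
    equal = any(l == r for l in left for r in right)
    includes = any(r in l for l in left for r in right)
    results = {"=": equal, "!=": not equal, "~": includes, "!~": not includes}
    return results[operator]


# Columns of the table: right is null, left equals right,
# right is subpart of left, left is subpart of right, other.
cases = [(["abc"], []), (["abc"], ["abc"]), (["abc"], ["b"]),
         (["b"], ["abc"]), (["abc"], ["xyz"])]
for op in ["=", "!=", "~", "!~", "?", "!"]:
    print(op, [subterm_set_is_true(op, l, r) for l, r in cases])
```

Each printed row corresponds to one row of the table, in the same column order.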