On separators et al.

Separators and other things - food for thought

This document has been prepared to narrow down the options the Committee will discuss at its next phone meeting. It addresses several points at once, because separators are used with identifiers and we found it hard to consider one without the other. The following therefore bears on section 2.2.3, on separators, of course, but also touches section 3.2, the tribunal identifier; section 3.4, the paragraph numbers; section 4.3, on chambers, subdivisions and judicial districts; section 4.5, on various qualifiers of decisions; and section 4.6, on references to notes.

On separators

So far, it’s been agreed that separators are useful to enhance readability, but that their number should also be kept to a minimum. We might also add that economy of use should be a concern. The questions of exactly which separators, where and consequently how many still remain.

A source of separators and their usage there

The characters and symbols available in our citation project are those of basic ASCII, the character set we’ve settled on because it’s still the only one understood by the vast majority of computer operating systems currently in use. Basic ASCII contains more than enough non-alphanumeric characters and symbols to meet our needs. Conventional legal references, which are descriptive, use quite a few separators; all of them are contained in basic ASCII. However, we might not want to carry over into our neutral citation the meanings some of these separators have acquired in conventional citation syntax. Another source of inspiration for a choice of separators might be the Internet and the addressing mechanisms it uses.

On the Internet, electronic mail addresses are composed of the name, or "nickname", of a person, separated from a composite domain name by the "@" symbol, which is reserved for this specific use. Since computers are leaves on the branches of the Internet, so to speak, and their directory structures are a continuation of the Internet, as reflected in the construction of URLs (Uniform Resource Locators), we might as well consider all of this as a whole, as a name-space. In this name-space, domain and sub-domain names are separated by periods, while computer directories and sub-directories are separated by forward or backward slashes, and even colons on Macintoshes. There might be other directory separators. Directory separators carry the notion of a left-to-right hierarchical relation, while periods in domain names carry the inverse, right-to-left notion of hierarchy. Still on computers, extensions to file names, used mainly to indicate a document format, are separated from the file name by a period. They can be considered right-side appendages, or suffixes. So we have:

Domain name: crdp.umontreal.ca

e-mail address : huard@crdp.umontreal.ca

URL : www.droit.umontreal.ca/citation/draft.rtf

In the above URL example, the period’s meaning as a separator is unambiguous due to context: we always know whether it is in a domain name or a file name. Slashes are uniquely associated with the directory hierarchy, so much so that they probably cannot be used for anything else.
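
The way context settles the meaning of each separator can be sketched in a few lines of Python. This is only an illustration applied to the examples above; the function names split_email and split_url are ours, and the sketch assumes a URL written without a scheme prefix, as in the example.

```python
def split_email(address):
    """Split an e-mail address at the reserved "@" into name and domain labels."""
    name, _, domain = address.partition("@")
    # Domain labels are listed from most general to most specific,
    # reflecting the right-to-left hierarchy of periods in domain names.
    return name, domain.split(".")[::-1]


def split_url(url):
    """Split a scheme-less URL into domain labels, directories, file and extension."""
    host, _, path = url.partition("/")
    *directories, filename = path.split("/")
    if "." in filename:
        # Inside a file name, the last period introduces the extension (a suffix).
        stem, _, extension = filename.rpartition(".")
    else:
        stem, extension = filename, ""
    return {
        "domain": host.split(".")[::-1],  # right-to-left hierarchy
        "directories": directories,       # left-to-right hierarchy
        "file": stem,
        "extension": extension,
    }
```

The same character, the period, is handled three different ways here, and the code never has to guess: the "@" and the first slash tell it which part of the name-space it is in.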

The hyphen is allowed in domain and file names, where it is treated like a letter within a character string between two periods. Usually in such a case there is no notion of hierarchy. The hyphen is in fact used in Canada to post both language versions of an acronym simultaneously, as in "scc-csc" for the Supreme Court. The underline character "_" is also sometimes used in place of a space where a system takes the space only as a separation between two character strings representing two different entities.

Our citation : when, where, how and which separators to use

Four situations where separators might be needed can be identified in the core of our citation project: (1) between elements of the core, these being the year element, the tribunal element and the ordinal number element; (2) between sub-elements of an element, namely of the tribunal identifier element, these sub-elements being a jurisdiction designator in the case of provinces, a possible subdivision of a complex tribunal and, when needed, an optional language code; (3) before the optional paragraph number, where a separator is necessary to distinguish it from the decision ordinal number; and (4) between the ordinal number and a suffix, since at least the possibility of a revision or corrigendum of a decision calls for one. Two criteria should inform our decisions to use separators: when and where. When is concerned with readability; it is by nature optional. Where is concerned with clarity. When clarity is not served well enough by the mere position of an identifier in the left-to-right ordering, a dedicated separator will confirm the identity of the identifier. This last case would answer the question of how. The question of which will be answered as we go.

(1) Between elements of the citation core, a separator is needed only for readability. It might as well then be a mere space. Readability would also benefit from the underline character being used in place of a space when the citation is used as a file name, for many operating systems do not accept spaces in file names. But few people unfamiliar with the standard would see the documents’ file names in any case. Would a separator still be needed there for machine processing? Probably not, since so far, although the length of some elements, namely the tribunal element, is not fixed, an alternation between digits and letters is maintained.
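
The claim that the digit and letter alternation is enough for machine processing can be checked with a short Python sketch. It rests on assumptions that are ours and not yet part of the standard: a four-digit year, a tribunal element made only of upper-case letters (subdivisions are left out here), and a purely numeric ordinal. The names CORE, split_core and as_filename are hypothetical.

```python
import re

# Assumed shapes: four-digit year, upper-case tribunal element, numeric ordinal.
# The space or underline between elements is optional in both positions.
CORE = re.compile(r"^(\d{4})[ _]?([A-Z]+)[ _]?(\d+)$")


def split_core(citation):
    """Split a citation core into (year, tribunal, ordinal), with or without separators."""
    match = CORE.match(citation)
    if match is None:
        raise ValueError("not a recognisable citation core: " + repr(citation))
    year, tribunal, ordinal = match.groups()
    return int(year), tribunal, int(ordinal)


def as_filename(citation):
    """Replace spaces with the underline character for use as a file name."""
    return citation.replace(" ", "_")
```

Under these assumptions, "1996 QCCQ 32", "1996_QCCQ_32" and "1996QCCQ32" all yield the same three elements, which is why the separator between core elements can remain optional.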

(2) Depending on the ordering of the sub-elements inside the tribunal identifier element, a separator might or might not be useful or essential between a pair of sub-elements. The use of separators inside the tribunal identifier element cannot be considered independently of its construction. For readability, this intra-element separator should be different from the separator used between elements. Clarity here will be served not only by the use of a separator, but also by the positioning of the different sub-elements. The optional language codes "EN" or "FR", being considered as suffixes, might for example always come last in the tribunal element and be preceded by a separator. Of course this excludes the use of these pairs of letters as identifiers of subdivisions of tribunals, for two optional occurrences that cannot currently be predicted, a language code and a subdivision, cannot be allowed to happen in the same place and have the same name. Precluding the use of the pairs of letters "EN" and "FR" to designate subdivisions of tribunals would have to be very clearly stated in the standard. For clarity, the tribunal identifier would always be separated on the right from a subdivision identifier or a language code by a separator. If both subdivision identifier and language code are present, they would be in that order, with the language code last at the right. Both would also always be used with their separator, marking them as suffixes.

An alternative would be to avoid all proximity of tribunal subdivision identifiers and language codes by relegating the latter to the left of the tribunal identifier. Since tribunal names and especially tribunal subdivision names are liable to evolve in unpredictable ways, thus at some point inevitably confronting the user with an unknown subdivision identifier, it might be preferable not to have the optional language code slated to appear where an unknown identifier might show up. Also, in a decentralised system where the attribution of names of tribunals and their divisions isn’t controlled by a single instance, we might want to consider that a new subdivision name abbreviation might happen to be the same as a seldom-used language code. Furthermore, in that alternate position, language codes could do without a mandatory separator, as their neighbour to the right would be a jurisdiction code: two such well-known standard codes as a province code and a language code are easily recognised without a separator between them. It wouldn’t really be a good idea to insert the optional language code between the jurisdiction and tribunal identifiers, as they should be considered as a unit. The only separator then needed in the tribunal element would precede a tribunal subdivision identifier. Thus its presence would unmistakably identify the letters following it on the right as a tribunal subdivision identifier, however obscure or novel. The use of a given separator for a single purpose confirms the identity of the identifier it particularises.

There might also be a hierarchy of sorts in the use of separators: the most often used separator should be the simplest. Hence the space should be used optionally for readability between the three non-optional elements of the core. Inside the tribunal element, clarity would be fostered by the period. Then the hyphen would be used in the paragraph pinpoint when a range of paragraphs is referred to.

It might be argued that doing away with the period used before the tribunal subdivision identifier would not diminish clarity. We say that while this may be true for someone very familiar with the courts of a province, retaining this separator as non-optional might still prove helpful even to non-casual users. Casual users, of course, will need all the help they can get: a full complement of optional separators. All optional separators used for readability should be used simultaneously as a group: either they’re used or they’re not.

(3) The comma is already used in the ABA proposal in front of an optional paragraph number. It should thus not be used otherwise. Furthermore, if a special character or symbol is dedicated to indicating the optional paragraph number, its use makes the comma optional, and thus used only for readability. The symbol "@" below is used as a credible example. Such a choice will be further discussed in relation to section 3.4 of the working draft and the associated question 3.2.

We would thus have, for the Youth Division of the Court of Quebec, first with the language code as a suffix and then with the language code relegated to the left:

1996 QCCQ.YD.EN 32, @44 where "@" is used for "paragraph", and

1996QCCQ.YD.EN32@44

1996 ENQCCQ.YD 32, @44 where "@" is used for "paragraph", and

1996ENQCCQ.YD32@44

The hyphen for a series of paragraphs would be used thus:

1996 QCCQ.YD 32, @44-47

1996QCCQ.YD32@44-47
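
To see whether the readable and compact forms above really carry the same information, here is a minimal Python sketch of a parser. It assumes the first layout discussed, with the language code as the last suffix of the tribunal element, and it treats the jurisdiction and tribunal letters ("QCCQ") as a single unit. The pattern, the reserved set LANGUAGE_CODES and the function name parse are our own illustrative choices, not part of the working draft.

```python
import re

# Reserved language codes: under this layout they can never name a subdivision.
LANGUAGE_CODES = {"EN", "FR"}

CITATION = re.compile(
    r"^(?P<year>\d{4}) ?"                                # year, optional space
    r"(?P<tribunal>[A-Z]+)(?P<suffixes>(?:\.[A-Z]+)*) ?"  # tribunal and period-led suffixes
    r"(?P<ordinal>\d+)"                                  # decision ordinal number
    r"(?:,? ?@(?P<first>\d+)(?:-(?P<last>\d+))?)?$"      # optional paragraph pinpoint
)


def parse(citation):
    """Parse a readable or compact citation into its components."""
    match = CITATION.match(citation)
    if match is None:
        raise ValueError("not a recognisable citation: " + repr(citation))
    suffixes = match.group("suffixes").split(".")[1:]
    language = None
    if suffixes and suffixes[-1] in LANGUAGE_CODES:
        language = suffixes.pop()  # the language code always comes last
    first = match.group("first")
    last = match.group("last") or first
    return {
        "year": int(match.group("year")),
        "tribunal": match.group("tribunal"),
        "subdivisions": suffixes,
        "language": language,
        "ordinal": int(match.group("ordinal")),
        "paragraphs": (int(first), int(last)) if first else None,
    }
```

With this sketch, parse("1996 QCCQ.YD.EN 32, @44") and parse("1996QCCQ.YD.EN32@44") return the same components, since the space and the comma serve readability only while the period, the "@" and the hyphen carry the structure.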

As the paragraph is indicated by a dedicated symbol, so could be the rare occurrence of a note being referred to. This question is addressed by section 4.6 and question 4.5 of the working draft. A number of possibilities are available, but using the letter "n" still seems the most intuitive choice:

1996 QCCQ.YD 32, n8 and

1996QCCQ.YD32n8

(4) Suffixes may also be added to the ordinal number to indicate a revision to a document, thus giving its own identifier to each document but retaining their kinship. A separator between the ordinal number and a single letter code at its right might be needed for readability, but should be optional, as the letter – digit alternation is retained. Here the period could also be used.

1996 QCCQ.YD 32.r, @44 or

1996QCCQ.YD32r@44
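
The rule that optional separators are used as a group, all or none, can be illustrated by a small Python formatter producing the two forms above from the same components. This is a sketch under our own assumptions: the function name format_citation and its parameters are hypothetical, and the revision code "r" is simply the one used in the example.

```python
def format_citation(year, tribunal, ordinal, subdivision=None,
                    revision=None, paragraph=None, readable=True):
    """Build a citation, with all optional separators switched on or off together."""
    gap = " " if readable else ""
    # The period before a subdivision is not optional: it is kept in both forms.
    element = tribunal + ("." + subdivision if subdivision else "")
    number = str(ordinal)
    if revision:
        # The period before the revision suffix is a readability separator only.
        number += ("." if readable else "") + revision
    text = f"{year}{gap}{element}{gap}{number}"
    if paragraph is not None:
        # The comma and space are optional; the "@" itself is always present.
        text += (", " if readable else "") + f"@{paragraph}"
    return text
```

Calling it with year 1996, tribunal "QCCQ", ordinal 32, subdivision "YD", revision "r" and paragraph 44 gives "1996 QCCQ.YD 32.r, @44" in readable form and "1996QCCQ.YD32r@44" with readable=False.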

At this point, we should probably remember the citation’s role as a document identifier rather than a description. Only codes identifying a different document (a different version is a different document) should be used. Such codes cannot be considered mere qualifiers, as they are used to identify a different document. Each different type of document should be given an identifier letter, and thus only one such code could be used at a time. An adequate set of single-letter codes will have to be devised and should be specified in the standard as an optional feature, "n" of course being already reserved to indicate a note. A second spot at this position could then be used for qualifiers when and if they are needed.

The question of qualifiers is the object of section 4.5 and question 4.4 of the working draft, where codes "U" for "unreported" and "R" for "revised" are proposed. The first, of course, is a qualifier, while the second is a descriptor. Other codes could be useful and retained in the standard, but they should always be distinguished as qualifiers or descriptors. Maybe "U" should be the only qualifier code mentioned in the standard, given that it should probably be left to tribunals to decide what they will publish. If some parties wish to further describe decisions with supplementary qualifiers in the position allowed above, the codes used should also be the object of a convention, not a part of this standard proper, and should probably be inserted in the citation as suffixes of the ordinal number. Obviously, the letter codes specified as part of this standard couldn’t be used for other purposes, and they should always come first when supplementary qualifiers are used with them.

*****