
Diamond Version 5 User Manual: Structure files

Access to COD ("Crystallography Open Database")

This article describes how to access the crystal structure database COD ("Crystallography Open Database").
- COD includes (amongst others) AMCSD ("American Mineralogist Crystal Structure Database") as well as CIF files from the IUCr journals.
- How to set up the COD for Diamond version 5, which is no longer included in the installation package.
- How to search the database through the "File/Search..." command.

Previous article: Inserting structure data into a document
Next article: Searching of Files


Contents of COD (Crystallography Open Database)

The crystal structure database COD ("Crystallography Open Database"), which includes (amongst others) AMCSD ("American Mineralogist Crystal Structure Database") as well as CIF files from the IUCr journals, has been set up for direct access by Diamond. In 2010, this database of inorganic as well as organic crystal structures passed the 100,000-entry mark. Now (August 2023) there are more than 505,000 datasets.

The COD version delivered with version 5.0.x of Diamond is dated April 15, 2021, and contains approximately 471,000 entries. The COD has been converted for Diamond into a proprietary file format: there are 27 "*.diambase" files for the sub-files of COD. For instance, "9.diambase" contains entries from AMCSD and "43.diambase" contains entries from "Inorg. Chem.". The COD version for Diamond will be updated regularly for the following versions 5.1.x etc.


Setup of COD

Since Diamond version 5.0, the COD is no longer part of the Diamond installation package, because the size of the diambase files has grown from 2.3 GB (COD-2012-04-03 with 156,000 entries, delivered with Diamond version 4.x) to 8.4 GB (COD-2021-04-15 with 471,000 entries).

After the installation of Diamond version 5, there is a "COD" sub-folder in the Diamond program directory, but it is nearly empty and contains just a Readme file. You first have to download the database files (packed in a single ZIP file) and copy them into that "COD" sub-folder. If you place the diambase files there before your first access to the COD database, Diamond will find them without further configuration.

Alternatively you can use a different folder and specify the path to the "diambase" files in the File Search Options dialog, available from the Search Files dialog, which is opened with the File -> Search command.

Downloading database files
To download the Diamond-specific version of the COD, visit our Diamond 5 download page: http://www.crystalimpact.de/diamond/v5download.htm. The "diambase" files have been compressed into a single ZIP file. Click on the download link mentioned on the Diamond 5 download page. This will start downloading the ZIP file into the default folder "Downloads" on your computer. (Alternatively, you can right-click the link and choose the "Save link as" command to choose a different folder.)

COD installation in default directory
The Diamond 5 installer creates a "COD" sub-folder in your Diamond program directory (which is by default "C:\Program Files\Diamond 5" or "C:\Program Files (x86)\Diamond 5"). This sub-folder contains only a "Readme.txt" file with a short installation remark and a pointer to this page. You should place the DIAMBASE files into this "COD" directory. (Note: You need administrator rights to put these files into a folder or sub-folder of the Windows programs directory.)

  1. Extract the DIAMBASE files from the downloaded ZIP file into a temporary directory.
  2. Move the DIAMBASE files to the "COD" sub-folder of the Diamond 5 program directory.
  3. When you start a search with the File/Search dialog, Diamond automatically checks whether all DIAMBASE files are present in the "COD" sub-folder of the Diamond 5 program directory.

COD installation in a different directory
If you do not want to place the DIAMBASE files into the "COD" sub-folder of the Diamond program directory, or if you have no administrator rights, you can place the files into a different folder, e.g. "C:\Users\YourName\Documents\COD". (The target folder does not have to be named "COD".)

  1. Extract the DIAMBASE files from the downloaded ZIP file into the target folder.
  2. Tell Diamond where to find the DIAMBASE files:
    a) Run the command File -> Search. (This first shows a warning message that the COD files have not been installed or the target folder has not yet been set.)
    b) After closing the warning message, the Search Files dialog comes up, where you click on the Search criteria button.
    c) In the File Search Options dialog click on the Database path button to choose the folder containing the DIAMBASE files.
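
Step 1 above (extracting the DIAMBASE files into the target folder) can also be scripted. The following is a minimal Python sketch, not part of Diamond. It assumes only that the downloaded archive is an ordinary ZIP file containing "*.diambase" files; the function name and the paths in the usage comment are hypothetical. Since it is not documented here whether the archive stores the files inside a folder, the sketch writes them flat into the target folder.

```python
import shutil
import zipfile
from pathlib import Path


def extract_diambase(zip_path, target_dir):
    """Copy every *.diambase member of the ZIP archive into target_dir.

    Any folder structure stored inside the archive is ignored, so all
    files end up directly in target_dir. Returns the sorted list of
    extracted file paths.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            name = Path(member.filename).name
            # Skip folders and anything that is not a diambase file.
            if member.is_dir() or not name.lower().endswith(".diambase"):
                continue
            destination = target / name
            with archive.open(member) as src, open(destination, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(destination)
    return sorted(extracted)


# Hypothetical usage (both paths are placeholders, adapt them to your system):
# files = extract_diambase(r"C:\Users\YourName\Downloads\cod.zip",
#                          r"C:\Users\YourName\Documents\COD")
# print(len(files), "diambase files extracted")
```

After the extraction, the folder still has to be selected in Diamond through the Database path button as described in step 2.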

 


Searching the COD

To search the COD for matching compounds, run the command Search from the File menu, which opens the Search Files dialog. This dialog window comes up with an empty list of matching COD entries when you run it the first time during a Diamond session. You have to define search criteria through the Search criteria button and then launch the search with the Start search button.

The following screenshot shows the result of a search for "Dinnebier" in the author fields, with 198 (of 471,693) entries matching in 17 (of 27) database sub-files. The results are sorted in descending order by publication year.

Screenshot of 'Search files' dialog with result of searching for author='Dinnebier' in COD version of April 15, 2021, containing 471,000 entries

To define where and what to search, open the "File Search Options" dialog by clicking on the "Search Criteria" button. This dialog consists of four pages (tabs):
(1) Location: Where to search (database and/or diamdoc, CIF etc. files).
(2) Restraints: Chemical and crystallographic search fields, from elements through space group to cell parameters.
(3) References: Search fields referring to publication as well as entry codes and dates.
(4) Find text: Search for text (fragments) in selected fields.

Location
On this page you decide whether to run the search (whose criteria will be defined on the subsequent three pages) in all or selected database files and/or in files on your hard disk. (The searching of files will be treated in the article "Improved Searching of Files".) In our sample search for author="Dinnebier", we searched all 27 database sub-files. Set the checkmark at "Search database (sub-) file(s)" and ensure that all checkmarks are set in the sub-file list.

Screenshot of (database and file) Location page of File Search Options dialog  

Restraints

Use this page to define restraints concerning chemical composition (elements, formula), space group, and/or cell parameters:

Elements (mandatory or optional)
Define one or more element symbols that must be present in your database entries, separated with '+' signs. So "Na+Cl+O" finds entries containing sodium and chlorine and oxygen; these entries may contain additional elements unless you add further restraints, cf. "Element count". Optional elements are separated with commas. So "Na,Cl,O" finds all entries that contain at least one of sodium, chlorine, or oxygen (or two or all three of them), which is a lot, of course. You may use brackets (but only one level), for instance "(Na,K,Rb,Cs)+Cl" to find entries that contain Cl together with at least one of Na, K, Rb, or Cs.
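
The logic of these element expressions can be illustrated with a short Python sketch. It is not Diamond's implementation, only a model of the rules described above: '+' means "and", ',' means "or", and one level of brackets groups alternatives. The function name is hypothetical, and how Diamond treats a mixed expression without brackets (such as "Na,Cl+O") is not documented here; the sketch simply assumes that ',' binds more tightly than '+'.

```python
def matches_elements(expression, elements):
    """Return True if the element set satisfies the search expression.

    Terms separated by '+' must all be satisfied. A term may be a single
    symbol or a comma-separated list of alternatives (optionally in
    brackets), of which at least one must be present.
    """
    present = set(elements)
    for term in expression.split("+"):
        alternatives = [s.strip() for s in term.strip().strip("()").split(",")]
        if not any(alt in present for alt in alternatives):
            return False
    return True


# All three elements are mandatory; additional elements are allowed.
print(matches_elements("Na+Cl+O", {"Na", "Cl", "O", "H"}))   # True
print(matches_elements("Na+Cl+O", {"Na", "Cl"}))             # False
# One of the optional elements is enough.
print(matches_elements("Na,Cl,O", {"Si", "O"}))              # True
# Cl together with at least one of the alkali metals listed.
print(matches_elements("(Na,K,Rb,Cs)+Cl", {"K", "Cl"}))      # True
print(matches_elements("(Na,K,Rb,Cs)+Cl", {"Li", "Cl"}))     # False
```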

Forbidden elements
List one or more elements (comma separated) that must not appear in the entries you are interested in.

Element count(s) or range(s)
Specify a number, a range, or multiple numbers or ranges separated with commas to define how many different elements may occur in your entries. This is mostly used in combination with the element search. For instance, if you define an element count of 3 and mandatory elements "Na+Cl+O", Diamond finds only ternary compounds containing Na, Cl, and O, whereas an element count of 4 or 5 means that each entry contains one or two additional elements, respectively. A range is given as two numbers separated with two dots, e.g. "3..4", or with one minus sign, e.g. "3-4". You can use "4..", for instance, to specify an element count of at least four.
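
The range notation can be modelled with a few lines of Python. This is an illustrative sketch with hypothetical function names, not Diamond's own parser; it covers only the forms described above (single numbers, "3..4", "3-4", and open-ended "4..") for integer values.

```python
def parse_ranges(spec):
    """Parse a spec such as "3", "3..4", "3-4", "4.." or "1-15,225..".

    Returns a list of (low, high) tuples; high is None for an
    open-ended range like "4..".
    """
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ".." in part:
            low, high = part.split("..", 1)
        elif "-" in part:
            low, high = part.split("-", 1)
        else:
            low = high = part
        ranges.append((int(low), int(high) if high.strip() else None))
    return ranges


def in_ranges(value, ranges):
    """Return True if value lies inside at least one of the ranges."""
    return any(low <= value and (high is None or value <= high)
               for low, high in ranges)


print(parse_ranges("3..4"))                 # [(3, 4)]
print(parse_ranges("3-4"))                  # [(3, 4)]
print(parse_ranges("4.."))                  # [(4, None)]
print(in_ranges(6, parse_ranges("4..")))    # True
print(in_ranges(5, parse_ranges("3..4")))   # False
```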

Formula type (ANX)
The search with formula type (or "ANX formula") is unreliable, since it requires oxidation numbers to be defined in the entry's atomic parameter list. (In the COD CIF entries, this usually means _atom_type_xxx loops in which oxidation numbers/charges are given.) Besides this, you define one or more formula types, separated with commas. 'A', 'B' etc. define cations, whereas 'X', 'Y' etc. define anions. 'M', 'N' etc. are neutral. For instance, "ABX4" finds the sodium perchlorates (provided that the mandatory elements Na+Cl+O are defined above, and provided that in each entry Na and Cl are defined with a positive oxidation number and O with a negative oxidation number).

Formula sum
Define a chemical formula, or multiple formulas separated with commas. For instance, "NaClO3, NaClO4" finds both sodium chlorate and sodium perchlorate.

Name
Define a text fragment that must occur in one of the chemical names. You can define multiple alternative fragments, separated with commas. For instance, "chlorate, chloride" finds entries with "chlorate" or "chloride" in their names. If you join multiple name fragments with '+', all these fragments must occur together in a name. You can use '-' to exclude fragments ("forbidden names"). For instance, "chlorate -perchlorate" finds entries with "chlorate" in one of the names but excludes entries containing "perchlorate" in one of their names. (Note that the blank before the '-' is mandatory here. Otherwise Diamond would misinterpret names such as "1,2-dimethyl-something" etc.)
You can use '*' to match text beginning or ending with a fragment. For instance, "sodium*" matches names beginning with "sodium", whereas "*hydrate" matches names ending with "hydrate".
Diamond searches the following name fields:
(1) the systematic name (_chemical_name_systematic)
(2) the common name (_chemical_name_common)
(3) the mineral name (_chemical_name_mineral)
(4) the name of the structure type (_chemical_name_structure_type).
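
The following Python sketch illustrates why the exclusion in "chlorate -perchlorate" is needed: the fragment "chlorate" is also contained in "perchlorate". The sketch is a simplified model, not Diamond's implementation. It handles only one wanted fragment followed by optional " -" exclusions, the function names are hypothetical, and the case-insensitive comparison is an assumption.

```python
def fragment_matches(fragment, name):
    """Match one fragment against one name.

    "sodium*" must start the name, "*hydrate" must end it, and a
    fragment without '*' may occur anywhere in the name.
    Comparison ignores case (an assumption of this sketch).
    """
    fragment, name = fragment.lower(), name.lower()
    if fragment.endswith("*"):
        return name.startswith(fragment[:-1])
    if fragment.startswith("*"):
        return name.endswith(fragment[1:])
    return fragment in name


def entry_matches(query, names):
    """Check an entry's names against a query like "chlorate -perchlorate".

    The first part is the wanted fragment; every part after " -"
    (blank, then minus) is a forbidden fragment.
    """
    wanted, *forbidden = [part.strip() for part in query.split(" -")]

    def occurs(fragment):
        return any(fragment_matches(fragment, n) for n in names)

    return occurs(wanted) and not any(occurs(f) for f in forbidden)


print(entry_matches("chlorate", ["sodium perchlorate"]))                # True
print(entry_matches("chlorate -perchlorate", ["sodium perchlorate"]))   # False
print(entry_matches("chlorate -perchlorate", ["sodium chlorate"]))      # True
print(entry_matches("sodium*", ["sodium chloride"]))                    # True
print(entry_matches("*hydrate", ["sodium chloride"]))                   # False
```

The list passed as `names` stands for the name fields of one entry, so a match in any of the four fields listed above counts.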

Space group/Int. Tables number(s) or symbol(s)
Define an International Tables space group number, or multiple numbers or ranges, separated with commas. For instance, "1-15,100-200,225.." allows all space groups from 1 through 15 as well as those from 100 through 200 as well as 225 or higher (which means through 230).

Cell parameters
Whether you search for one of the six cell parameters a, b, c, alpha, beta or gamma, or for the cell volume, use a range or several ranges, e.g. "100..110, 130..140" for the cell volume.

Combination of restraints
If you define restraints (search items) in more than one field, these search items will be AND-connected during the search process. For instance, if you define "Na+Cl+O" in the Elements input field and the space group number range 1..15, Diamond will only find compounds that contain Na and Cl and O (and, since there is no restraint on the element count, optionally a fourth, fifth, etc. element) and that belong to one of the space groups with Int. Tables space group numbers 1 through 15. This also applies to restraints made on the other page ("References"). So if you specify a range for the publication year, only entries published in years matching this range will find their way into the search result list.
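
The AND-connection can be pictured as a list of independent checks that an entry must all pass. The Python sketch below illustrates this with three toy records that are invented for the illustration (they are not COD entries); the field names and the function name are hypothetical as well.

```python
def matches_all(entry, restraints):
    """An entry is a hit only if every restraint accepts it (logical AND)."""
    return all(check(entry) for check in restraints)


# Toy records, invented for this illustration only.
entries = [
    {"id": "A", "elements": {"Na", "Cl", "O"}, "space_group": 14},
    {"id": "B", "elements": {"Na", "Cl", "O", "H"}, "space_group": 62},
    {"id": "C", "elements": {"Na", "Cl"}, "space_group": 2},
]

restraints = [
    # Elements: Na+Cl+O (additional elements are allowed)
    lambda e: {"Na", "Cl", "O"} <= e["elements"],
    # Space group number range 1..15
    lambda e: 1 <= e["space_group"] <= 15,
]

hits = [e["id"] for e in entries if matches_all(e, restraints)]
print(hits)  # ['A']
```

Record B fails the space group restraint and record C lacks oxygen, so only record A passes both checks. Adding a further restraint, for example on the publication year, can only shorten the hit list, never extend it.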

Examples of combined restraints
Before we continue with the second search criteria page, "References", we do some search exercises: in each of them we define some restraints and start a search. Every time you have defined your restraints and closed the "File Search Options" dialog, click on the "Start search" button in the "Search files" dialog to update your results list.

Please note: The following examples were made with an older COD version of April 2012 with just 156,000 entries.

(1) The following restraints for ternary compounds containing both Na and Cl yield 185 matching entries:
Elements (mandatory or optional): Na+Cl
Element count(s) or range(s): 3


Screenshot of search for ternary compounds (element count 3) and mandatory elements Na and Cl. There are sodium chlorate (V) and perchlorate (VII) as well as a Na3OCl, besides a lot of entries about (Na,K)Cl compounds.

(2) Excluding the compounds that contain O leaves 174 entries:
Elements (mandatory or optional): Na+Cl
Forbidden elements: O
Element count(s) or range(s): 3

(3) As a cross-check, we move the O from the forbidden to the mandatory elements and get 11 matching entries:
Elements (mandatory or optional): Na+Cl+O
Element count(s) or range(s): 3

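These 11 entries are the chlorates and perchlorates of Na as well as Na3OCl. Together with the 174 oxygen-free entries from example (2), they account for all 185 entries of example (1).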

(4) Expand the restraints to K, Rb, and Cs:
Elements (mandatory or optional): (Na,K,Rb,Cs)+Cl+O
Element count(s) or range(s): 3

We now have 19 matching entries, since there are other alkali chlorates or perchlorates as well as, for instance, a rubidium chloride oxide.

Viewing some or all of the resulting entries
To view one or more entries of the search result list, set the corresponding checkmarks in the first column (you can use the "Select all" button to mark all) and click on "Open" to create a Diamond document. The document opened from the search results will have a title containing the date and time of your search. The "Search files" dialog is modeless, which means it stays open while you continue working in Diamond with the results. However, you will most likely close the dialog to save space on your screen. (Alternatively, you can click on the minimize button in the right part of the dialog's title bar.) The search result list will re-appear the next time you open the dialog (with "File"/"Search..."). When you close the Diamond application and later start a new Diamond session, the next "File/Search" will show the latest restraints, but you must run "Start search" again to rebuild the results list.

References

The section "Bibliography" offers the following searchable fields:

Author(s):
Define the name or name fragment of an author you are searching for. You can specify multiple names, separated with semicolons (';' -- not commas!), for instance "Jansen; Korber" finds entries where either "Jansen" or "Korber" is mentioned (or even both authors). Use the '+' sign to require both names in an entry, that means "Jansen +Korber" finds entries contributed by both Jansen and Korber. Use the '-' sign to exclude a name, for instance "Jansen -Korber" finds entries with (co-)author Jansen but excludes those where Korber is (co-)author. Do not omit the blank before the '-'; otherwise, Diamond would search for a double name "Jansen-Korber" like "Müller-Buschbaum".
You can include the first name, too, but this is unreliable, because the CIF entries from different journals follow different conventions concerning the order of first name and surname and whether and how to abbreviate the first name. For instance, "Jansen, M." should find M(artin) Jansen but not e.g. E. Jansen, J. Jansen, and G. Jansen. This also works for combinations of names, e.g. "Jansen, M. +Korber, N.".

Journal name(s) or coden(s):
Define one or more fragments of journal titles. For instance "Kristall." finds both "Kristallografiya" and "Kristall und Technik". "Z. Krist." or "Z Kristallogr" (with or without the dot) finds "Zeitschrift für Kristallographie". Or use journal codens, e.g. "ACCRA9" for "Acta Crystallographica, Section C". Use the ';' sign to separate multiple journal names or codens, e.g. "Z. Krist.; J. appl. Cryst.".

Publication year(s) or range(s):
Define a (4-digit) year, or multiple years or ranges, separated with commas. For instance, "1970..1980,2010.." allows all publications of the years 1970 through 1980 as well as from 2010 until the latest/newest.

Publication(s):
Define one or more journal names or codens (separated with semicolons), each optionally followed by a volume number and (again optionally) a page number, separated with commas. For instance, "Z. Krist.,188;Z. Krist.,189" finds both volumes 188 and 189 of "Zeitschrift für Kristallographie".

In the section "Codes" you can search for entry codes as well as for recording dates or updates.

Entry number(s) or code(s):
Define an entry number or code (which is numeric in COD), or multiple entry codes or ranges of numbers, e.g. "1000000..1000999;8100000..8100999".

Recording date and Update:
Define two delimiting years or two dates ("yyyy-mm-dd") in the recording date or update fields, respectively, to restrict the search on these items.

Find text
It is possible to search database files sequentially for a text (fragment), which is a lengthy process when applied to the whole COD database! Since a text search is most likely applied to selected files in selected parts of your hard disk, it is described in the article "Searching of Files".
If you want to search for a text (fragment) in COD, you should define suitable restraints on the "Restraints" and/or "References" pages, so that Diamond needs to search only the COD entries matching these restraints -- and not the whole COD, which could be a lengthy process lasting several minutes!

