Purpose and Background:
Each resource or item in the archive needs accompanying information in Extensible Markup Language (XML) that tells current and future users, distribution systems and archives as much as possible about it. To a human reader, XML is less than felicitous. Editing XML manually is time-consuming and the potential for errors is high. Commercial software can streamline the process, but it was cost-prohibitive for this project. Therefore, this XML generator was created to help. The system is a mix of PHP, HTML and CSS and is intended for portability and ease of modification. All of it, excluding this help page, is contained in a single PHP file.
Process & Workflow:
- Input fields. Complete all of the fields that are relevant to the resource. At a minimum, complete Title, Date and Description. Linguistic Field, Subject and Type are set to default values that you will most likely never have to edit; leave them as they are unless you want to change them.
- Click Submit.
- After the XML is generated, select all of the text and copy it (right-click 'copy' or ctrl-c).
- Paste it into the Post field at the bottom of the entry in Wordpress (the main archive site).
Why is the generated XML font size so small?:
The text is deliberately small so that selecting and copying it is easier and quicker. PHP runs on the server and, unfortunately, has no quick way of sending echoed text or text in a div to the clipboard. Thus, the selecting and copying must be done manually.
If you need to examine the generated code, you can paste it into a text editor or Wordpress.
Wordpress:
This builder also generates some tags that will allow a seamless transition to the current Wordpress archive. Copy the generated code and paste it in Wordpress and it will all be formatted automatically.
Audio Files in Wordpress:
If you select "Sound" for type and "FLAC" for format, the system assumes that there is an MP3 file with the same URN as the FLAC file in the same location. Put the full URL to the FLAC file in the Resource URL box and the system will find the MP3 in the same location and embed it for streaming in the Wordpress CMS.
Example:
If you put this URL to the FLAC file in the Resource URL box:
http://depts.washington.edu/sahteach/wordpress/wp-content/uploads/2015/04/Recording_of_the_Language_101.FLAC
After you click 'Submit', the builder creates a link to the FLAC file for downloading and also creates the code for automatically embedding the MP3, which looks like this:
[audio mp3="http://depts.washington.edu/sahteach/wordpress/wp-content/uploads/2015/04/Recording_of_the_Language_101.MP3"][/audio]
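The substitution in the example above can be sketched in a few lines. This is an illustration in Python, not the generator's actual PHP code; the function name `mp3_shortcode` is hypothetical, and preserving the case of the extension (.FLAC becomes .MP3) is an assumption based on the example URLs.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def mp3_shortcode(flac_url: str) -> str:
    """Build a Wordpress [audio] shortcode for the MP3 stored beside a FLAC file."""
    parts = urlsplit(flac_url)
    stem, ext = posixpath.splitext(parts.path)
    if ext.lower() != ".flac":
        raise ValueError("expected a URL ending in .flac")
    # Assumption: mirror the case of the original extension,
    # so .FLAC becomes .MP3 and .flac becomes .mp3.
    mp3_ext = ".MP3" if ext.isupper() else ".mp3"
    mp3_url = urlunsplit(parts._replace(path=stem + mp3_ext))
    return f'[audio mp3="{mp3_url}"][/audio]'
```

Only the extension changes; the folder and the filename (URN) stay the same, which is why the MP3 must be uploaded to the same location as the FLAC file.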
Input Field Background:
Most of the code that will be generated will never need to be altered (e.g., the Dublin Core and OLAC declarations). The input fields in the generator represent the XML tags that are most likely to be altered for the purposes of this archive, and they are arranged by how often they are expected to be edited (i.e., top=more, bottom=less). Any tags that are not represented by a field may always be edited manually after the generation process.
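To illustrate the split between fixed declarations and editable fields, here is a minimal sketch in Python. It is not the generator's PHP code, and the generator's real output is not reproduced here. The function name `build_record` and the choice of three fields are assumptions; the namespace URIs are the published OLAC 1.1 and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET

# Fixed declarations: these rarely, if ever, need to change.
OLAC_NS = "http://www.language-archives.org/OLAC/1.1/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("olac", OLAC_NS)
ET.register_namespace("dc", DC_NS)


def build_record(title: str, date: str, description: str) -> str:
    """Wrap the three minimum fields in an OLAC record (hypothetical helper)."""
    root = ET.Element(f"{{{OLAC_NS}}}olac")
    # Editable fields: one Dublin Core element per input field.
    for tag, text in (("title", title), ("date", date), ("description", description)):
        child = ET.SubElement(root, f"{{{DC_NS}}}{tag}")
        child.text = text
    return ET.tostring(root, encoding="unicode")


print(build_record("Homework 1.3", "2015-04-01", "Practice exercises"))
```

The namespace declarations come from constants, while the element text comes from the caller, which mirrors how the input fields feed only the parts of the XML that vary between resources.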
Input Field Descriptions:
Below are simplified descriptions. Please refer to Dublin Core and OLAC for more info.
- Title: The name of the resource.
- Date: The date the resource was created. If the resource was created in 1995 but incorporates another resource (say, a text from 1918), enter the most recent creation date (1995) in this field and the older date (1918) in the Additional Date field. Acceptable formats for the date can be found here: http://www.w3.org/TR/NOTE-datetime
- Description: Any additional info about the resource. For example, a sub-title might go here.
- Resource URL: The full address of the location where the file is stored, whether in the Wordpress system (or other CMS), on another server, on YouTube, etc. Unlike the URN, the URL includes the full filename and extension.
- Filename (URN): The filename, which also serves as the resource's Uniform Resource Name (URN). The URN does not include a file extension (e.g., .mp3 or .pdf) and this input should not either.
- Type: The physical type of the resource, for example text (paper, book) or sound (audio recording).
- Linguistic Type: The type of the resource linguistically, primarily related to language documentation. See here for explanations of each of the options: http://www.language-archives.org/REC/type.html
- Format: The digital format of the resource. See here for more info: http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements#format
- Table of Contents: If the resource has sub-units, such as chapters, they are listed here.
- Associated collection: If the resource has any related resources, either similar resources or collections that contain it, list them here. Example: Homework 1.3 might have an associated collection of Homework 1 or Homework 1.2.
- Linguistic Field: This is the sub-field that is associated with the resource. Most likely it will be applied_linguistics. See here for more info: http://www.language-archives.org/REC/field.html
- Subject: The subject of the resource, most likely 'Teaching the Sahaptin/Yakama Language'.
- Publisher: If no publisher exists, leave blank.
- Speaker name: For audio recordings and transcribed speech, enter the speaker's name here.
- Extra Depositor Name: If you are not one of the original workers on the project (Beavert, James, Hargus or Hugo), enter your full name here.
- Additional Date: See Date above.
- Restricted Access: Check this box if the resource has sensitive content and should be restricted such that only members of the Yakama Nation can access it. Checking this box does not secure the content. It only informs future archives that it should be. Any current archive systems will have to restrict the content in their own way.
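The Date and Filename (URN) rules above can be checked before you paste values into the form. The following Python sketch is only an illustration and is not part of the generator; the names `is_w3c_date` and `urn_from_url` are hypothetical. The pattern covers the formats listed in the W3C note linked under Date, but it does not check that a day exists in a given month (for example, it accepts 1995-02-31).

```python
import posixpath
import re
from urllib.parse import urlsplit

# Formats from the W3C date and time note: YYYY, YYYY-MM, YYYY-MM-DD,
# or a full date followed by a time and a required time zone (Z or +hh:mm / -hh:mm).
W3C_DATE = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?"
    r"(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?"
)


def is_w3c_date(value: str) -> bool:
    """Return True if the whole string matches one of the accepted formats."""
    return W3C_DATE.fullmatch(value) is not None


def urn_from_url(url: str) -> str:
    """Return the filename from a Resource URL with its extension removed."""
    filename = posixpath.basename(urlsplit(url).path)
    return posixpath.splitext(filename)[0]
```

For instance, `is_w3c_date("1995")` and `is_w3c_date("1918-07-16")` are accepted, while `is_w3c_date("04/16/1995")` is rejected. Applying `urn_from_url` to the FLAC URL from the audio example gives `Recording_of_the_Language_101`, the value that belongs in the Filename (URN) field.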
Acknowledgements:
Thank you to Theodore Gerontakos at the UW Libraries for the help on this project and inspiration for this XML generator.