xmlcoll package

A package of Python routines for working with sample data stored in XML format.

xmlcoll.base module

Module providing base property functions.

class xmlcoll.base.Properties

Bases: object

A class for storing and retrieving properties.

get_properties()

Method to retrieve the properties.

Returns:

dict: The dictionary of current properties.

update_properties(properties)

Method to update the properties.

Args:

properties (dict): A dictionary of properties. Properties not already present are added; existing properties are updated. Each dictionary key consists of a name and up to five optional tags, labeled tag1, tag2, …, tag5.

Returns:

On successful return, the properties have been updated.
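
The add-or-update behavior described above can be illustrated with a small stand-in class. This is a minimal sketch, not the package's implementation: PropertiesSketch and its internal dictionary are hypothetical, and only the two method names come from the interface documented here.

```python
class PropertiesSketch:
    """Hypothetical stand-in that mirrors the documented interface."""

    def __init__(self):
        self._properties = {}

    def get_properties(self):
        # Return the dictionary of current properties.
        return self._properties

    def update_properties(self, properties):
        # New keys are added; keys already present take the new value.
        self._properties.update(properties)


props = PropertiesSketch()
props.update_properties({"color": "red", "mass": "1.0"})
props.update_properties({"color": "green", "shape": "round"})
print(props.get_properties())
# {'color': 'green', 'mass': '1.0', 'shape': 'round'}
```

The second call overwrites "color" and adds "shape", while "mass" is left untouched.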

xmlcoll.coll module

Module for XML collections of items.

class xmlcoll.coll.Collection(items=None)

Bases: Properties

A class for storing and retrieving data about data items.

Args:

items (list, optional): A list of individual xmlcoll.coll.Item objects.

add_item(item)

Method to add an item to a collection.

Args:

item (xmlcoll.coll.Item): The item to be added.

Returns:

On successful return, the item has been added.

get()

Method to retrieve the item collection as a dictionary.

Returns:

dict: A dictionary of the items.

get_dataframe(index_label='name', tag_delimiter='_')

Method to retrieve the collection data as a pandas dataframe.

Args:

index_label (str, optional): Index label for the dataframe.

tag_delimiter (str, optional): Delimiter used to separate tags in combined column names.

Returns:

pandas.DataFrame: A pandas dataframe containing the collection data. Columns are labeled by a string formed by concatenating property names and tags separated by the chosen delimiter.
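
The column naming rule described above can be sketched in plain Python. This is an illustration only: column_label is a hypothetical helper, and representing a tagged property as a tuple of the form (name, tag1, tag2, ...) is an assumption rather than something this reference states.

```python
def column_label(key, tag_delimiter="_"):
    """Join a property name and its tags into a single column label.

    `key` is either a plain name (str) or, by assumption, a tuple
    of the form (name, tag1, tag2, ...).
    """
    if isinstance(key, str):
        return key
    return tag_delimiter.join(key)


print(column_label("color"))
# color
print(column_label(("abundance", "fe56", "zone1")))
# abundance_fe56_zone1
print(column_label(("abundance", "fe56"), tag_delimiter=":"))
# abundance:fe56
```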

remove_item(item)

Method to remove an item from an item collection.

Args:

item (xmlcoll.coll.Item): The item to be removed.

Returns:

On successful return, the item has been removed.

update_from_dataframe(data_frame, index_label='name', tag_delimiter='_')

Method to update collection data from a pandas dataframe.

Args:

data_frame (pandas.DataFrame): The pandas dataframe.

index_label (str, optional): Index label for the data frame.

tag_delimiter (str, optional): Delimiter used to separate tags in combined column names.

Returns:

On successful return, the collection has been updated with the data in the data frame.
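
When reading column names back, a combined label has to be split into a name and its tags again. The sketch below shows one way that split could work. split_label is a hypothetical helper and the tuple form of a tagged key is an assumption; the package may handle this differently.

```python
def split_label(label, tag_delimiter="_"):
    """Split a combined column label into a name and optional tags."""
    name, *tags = label.split(tag_delimiter)
    # Return a plain name when there are no tags, else (name, tag1, ...).
    return (name, *tags) if tags else name


print(split_label("color"))
# color
print(split_label("abundance_fe56"))
# ('abundance', 'fe56')
print(split_label("mass_fraction:fe56", tag_delimiter=":"))
# ('mass_fraction', 'fe56')
```

Under this splitting rule, a property name that itself contains the delimiter would be read as a name plus tags, so it may be safer to choose a tag_delimiter that does not occur in any property name or tag.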

update_from_xml(file, xpath='')

Method to update an item collection from an XML file.

Args:

file (str): The name of the XML file from which to update.

xpath (str, optional): XPath expression to select items. Defaults to all items.

Returns:

On successful return, the item collection has been updated.
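
To show what selecting items with an XPath expression can look like, here is a minimal sketch using only the standard library. The element and attribute names (collection, items, item, name, properties, property) are assumptions made for illustration, and how the package combines the xpath argument with its own item path is not specified in this reference.

```python
import xml.etree.ElementTree as ET

# Illustrative document; the layout is an assumption, not the package schema.
XML = """
<collection>
  <items>
    <item name="apple">
      <properties><property name="color">red</property></properties>
    </item>
    <item name="pear">
      <properties><property name="color">green</property></properties>
    </item>
  </items>
</collection>
"""

root = ET.fromstring(XML.strip())

# Without a predicate, every item element is selected.
all_names = [item.get("name") for item in root.findall(".//item")]

# A predicate on the name attribute restricts the selection.
selected = [item.get("name") for item in root.findall(".//item[@name='pear']")]

print(all_names)
# ['apple', 'pear']
print(selected)
# ['pear']
```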

validate(file)

Method to validate a collection XML file.

Args:

file (str): The name of the XML file to validate.

Returns:

An error message if invalid and nothing if valid.

write_to_xml(file, pretty_print=True)

Method to write the collection to XML.

Args:

file (str): The output file name.

pretty_print (bool, optional): If True, the XML is written in an indented, human-readable format.

Returns:

On successful return, the item collection data have been written to the XML output file.
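
The difference between compact and indented output can be seen with the standard library alone. This sketch is not how the package writes its files: the element names are assumptions, and xml.etree.ElementTree is used here only to illustrate the effect of pretty printing.

```python
import xml.etree.ElementTree as ET

# Build a tiny tree; the element names are illustrative assumptions.
root = ET.Element("collection")
items = ET.SubElement(root, "items")
ET.SubElement(items, "item", name="apple")

# Compact form: everything on a single line.
compact = ET.tostring(root, encoding="unicode")

# Indented form: ET.indent (Python 3.9+) adds line breaks and indentation.
ET.indent(root)
pretty = ET.tostring(root, encoding="unicode")

print(compact)
print(pretty)
```

Both strings describe the same elements and attributes; only the whitespace between elements differs.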

class xmlcoll.coll.Item(name, properties=None)

Bases: Properties

A class for storing and retrieving data about a data item.

Args:

name (str): The name of the item.

properties (dict, optional): A dictionary of properties.

get_name()

Method to retrieve the name of the item.

Returns:

str: The name of the item.
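
The classes documented in this module can be combined into a short write-and-read round trip. The following is a sketch that relies only on the signatures listed in this reference. The item names, property values, and file name are illustrative, round_trip is a hypothetical helper, and the example runs against the package only if xmlcoll is installed.

```python
import os
import tempfile


def round_trip(xc, path):
    """Build a small collection, write it to XML, and read it back.

    `xc` is expected to be the xmlcoll.coll module.
    """
    # Item names and property values here are illustrative only.
    apple = xc.Item("apple", properties={"color": "red"})
    pear = xc.Item("pear", properties={"color": "green"})

    coll = xc.Collection(items=[apple, pear])
    coll.update_properties({"source": "example"})
    coll.write_to_xml(path, pretty_print=True)

    # Read the file into a fresh collection and return its items.
    restored = xc.Collection()
    restored.update_from_xml(path)
    return restored.get()


try:
    import xmlcoll.coll as xc  # requires the xmlcoll package
except ImportError:
    xc = None

if xc is not None:
    with tempfile.TemporaryDirectory() as tmp:
        items = round_trip(xc, os.path.join(tmp, "fruit.xml"))
        print(len(items))
```

Passing the module in as an argument keeps the helper importable even when the package is absent.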