The filerecords API

filerecords is a Python package for reading and writing file records from the terminal. It also provides a set of classes that can be used directly from within Python scripts.

The basic File Record

FileRecord is the basic object through which records within a registry are accessed. It can read and edit the records for a specific file in the registry.

API Usage

In the code API, a Registry returns a FileRecord object for a specific file when a record is accessed. Each FileRecord requires a Registry object as its parent on initialization. It then locates its entry either by the internal id (as happens when records are accessed through the Registry object) or by the file path, which is searched for in the registry. Accessing records through the Registry object is the recommended route.

Once a FileRecord has been obtained from the registry, its comments and flags can be accessed using the comments and flags attributes.

from filerecords.api import Registry

reg = Registry()

# Access a record via the registry
record = reg.get_record( "path/to/file" )

# Access the comments
comments = record.comments

# Access the flags
flags = record.flags

The FileRecord object can also be used to add new comments and flags to the record.

record.add_comment( "the new comment" )
record.add_flags( ["flag1", "flag2"] )

When adding comments or flags it is important to save the record afterward. This will automatically also save the parent registry.

# Save the record (will also save the registry)
record.save()

class filerecords.api.file_record.FileRecord(registry, id: Optional[str] = None, filename: Optional[str] = None)

Bases: BaseRecord

This class represents a single file record entry.

Parameters
  • registry (Registry) – The registry the file record is associated with.

  • id (str) – The unique identifier of the file record. If none is provided, a new id is created.

  • filename (str) – The filename of the file to record (if a new record is being created).

add_flags(flags: str)

Add flags to the metadata.

Note

This will not automatically save the metadata, use the save() method to do so.

Parameters

flags (str or list) – The flag(s) to add. This can also be a defined flag-group label.

load()

Loads the file records.

Note

This is done automatically during init if an existing file is specified.

save()

Save the file record. This also saves the registry state at the same time.

to_markdown(comments_header: bool = True)

Convert the metadata to a markdown representation.

Parameters

comments_header (bool) – Add a header above the comments.

to_yaml(timestamp: bool = False, filename: Optional[str] = None)

Convert the source registry to a single YAML file.

Parameters
  • timestamp (bool) – Add a timestamp to the YAML output.

  • filename (str (optional)) – The filename of the yaml file to create. If none is provided, no file is created.

Returns

The assembled dictionary of the registry.

Return type

dict
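
The methods documented above combine naturally into an annotate-then-save routine. The following is a minimal sketch that uses only calls named in this section (get_record(), add_comment(), add_flags(), save()). The helper name annotate_record and its arguments are hypothetical, and it assumes that the file is already recorded and that get_record() returns a single FileRecord rather than a list.

```python
def annotate_record(reg, path, comment=None, flags=None):
    """Add a comment and/or flags to an existing record, then save it.

    `reg` is expected to be a filerecords Registry (or any object that
    offers the same get_record() method). This helper is hypothetical
    and not part of the filerecords package.
    """
    record = reg.get_record(path)
    if comment is not None:
        record.add_comment(comment)
    if flags is not None:
        # add_flags() takes a single flag, a list of flags,
        # or a defined flag-group label.
        record.add_flags(flags)
    # Nothing is persisted until save() is called; saving the record
    # also saves the parent registry.
    record.save()
    return record
```

Wrapping the calls this way keeps the save() step from being forgotten, which matters because add_flags() does not save on its own.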

The main Registry

Registry is the main class handling the filerecords registry. The registry itself is a hidden directory containing the records of files and directories in YAML format.

API Usage

In the code API, a registry can be accessed by setting up a Registry object, which will automatically locate the closest available registry, but can also initialize a new registry.

from filerecords.api import Registry

# Call a registry from the current directory.
reg = Registry()

Initializing new registries

The above code will try to find a registry in any directory that is higher in the directory tree than the current directory. If none is found, a new registry is automatically initialized in the current directory.

If a registry is found, it can be used right away. A new registry can nevertheless be initialized in the current directory by calling reg.init().

# Initialize a new registry in the current directory
# (even if a registry was found in a parent directory).
reg.init()
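
To make the idea of locating the "closest" registry concrete, here is a minimal standard-library sketch of an upward directory search. It is not the filerecords implementation. The function name and the directory name ".registry" are placeholders, because the actual name of the hidden registry directory is not given here, and the sketch assumes the starting directory itself is checked first.

```python
from pathlib import Path


def find_closest_registry(start=".", dirname=".registry"):
    """Return the nearest directory at or above `start` that contains `dirname`.

    `dirname` is a placeholder for the hidden registry directory;
    the real name used by filerecords may differ.
    """
    start = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for candidate in (start, *start.parents):
        if (candidate / dirname).is_dir():
            return candidate
    # Nothing found: a caller would initialize a new registry at this point.
    return None
```

Under this model, calling reg.init() in the current directory creates a registry that is closer than any parent registry, so it is the one found on later lookups from that directory.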

Accessing the registry

The registry’s own metadata can be accessed via the comments and flags attributes directly.

# Get the registry's comments.
reg.comments

# Get the registry's flags.
reg.flags

# Get the defined flag groups.
reg.groups

New comments and flags can be added to the registry itself using the add_comment() and add_flags() methods.

# Add a new comment.
reg.add_comment( "This is a new comment." )

# Add a new flag.
reg.add_flags( "new_flag" )

After editing the comments or flags of a registry, the changes must be saved using the save() method.

# Save the registry.
reg.save()

To access records from the registry the get_record() method can be used.

# Get a record from the registry.
record = reg.get_record( "path/to/file" )

# e.g.
record = reg.get_record( "results/gsea_20082022.tsv" )

This will return a FileRecord object, which gives access to the metadata of the file.

Alternatively, if the precise filename is not known or several files need to be found, the search() method can be used to find records based on their flags or on regex patterns in their filenames.

# Search for records with the "important" flag.
records = reg.search( flag = "important" )

# Search for records with the "important" flag and which are bam files.
records = reg.search( flag = "important", pattern = r".*\.bam" )

Note

Search always applies AND logic to the pattern and the flag. Only a single flag is supported for searching. To include multiple flags in a search, first define a flag group and then search for the group’s label flag.

For instance, assume we wish to search for the flags “important” and “results” at the same time. We first create a group (see below) containing these flags, then flag our files of interest with the group, and finally search for the group. This is not an ideal system yet and may be updated in the future.
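
The combination of pattern and flag can be illustrated with a small stand-alone sketch. It is not the library's code: the dictionary of filenames and flags is a stand-in for the registry contents, the function name is hypothetical, and the use of re.search (rather than a full match) is an assumption.

```python
import re


def search_records(records, pattern=None, flag=None):
    """Return the filenames that satisfy every criterion that was given.

    `records` maps filenames to lists of flags. It only mimics the
    registry contents for the purpose of this illustration.
    """
    regex = re.compile(pattern) if pattern is not None else None
    hits = []
    for filename, flags in records.items():
        # Both tests must pass: the pattern AND the flag.
        if regex is not None and not regex.search(filename):
            continue
        if flag is not None and flag not in flags:
            continue
        hits.append(filename)
    return hits


records = {
    "aligned/sample1.bam": ["important", "results"],
    "aligned/sample2.bam": ["temporary"],
    "notes.txt": ["important"],
}
hits = search_records(records, flag="important", pattern=r".*\.bam")
```

Here only "aligned/sample1.bam" is returned: the second BAM file lacks the flag, and the text file does not match the pattern.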

Flags and flag groups

Flags are used to mark files in the registry. They can be used to mark files as important, as results, as temporary files, etc. They are a shorthand for the details in the comments. Often it is desirable to add multiple flags at the same time to a file. To avoid having to repeatedly add multiple flags at the same time, registries support flag groups which summarize multiple flags by one label.

# Add a new flag group
reg.add_group( "supergroup", ["important", "results"] )

# now add a new file and flag it as member of the supergroup
reg.add( "my_superfile.txt", comment = "superfile", flags = "supergroup" )

# search for all files flagged as supergroup
records = reg.search( flag = "group:supergroup" )
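
For files that are already recorded, the same workflow can be wrapped in a small helper. This is a sketch that uses only the documented calls add_group(), update(), save() and search(). The helper name is hypothetical, and it assumes that every file in filenames already has a record (otherwise add() would be needed instead of update()).

```python
def flag_files_as_group(reg, label, flags, filenames):
    """Define a flag group, tag recorded files with it, and search for it.

    `reg` is expected to be a filerecords Registry. This helper is
    hypothetical and not part of the filerecords package.
    """
    reg.add_group(label, flags)
    for filename in filenames:
        # update() is used because the files are assumed to be recorded already.
        reg.update(filename, flags=label)
    # add_group() does not save on its own, so save explicitly.
    reg.save()
    # Groups are searched with the "group:" prefix.
    return reg.search(flag=f"group:{label}")
```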

Adding and editing records

New files can be added to the registry using the add() method. If a file is already recorded, the update() method must be used instead.

# add a "results/" directory to the registry
reg.add( "results/", comment = "the main results", flags = ["results", "important"] )

# to now add additional comments or flags to the "results/" directory use update instead
reg.update( "results/", comment = "an additional comment about the results", flags = ["perhaps_another_flag"] )

Records (i.e. files or directories) can be moved and/or removed from the registry. By default, these actions also affect the files in the filesystem: for example, a file is deleted from disk when its record is removed. This can be controlled with the keep_file argument.

# Move a file to a new location.
reg.move( "/path/to/file", "new/path/to/file" )

# Only adjust the references in the registry
# (maybe because you already moved the file and forgot to update the registry).
reg.move( "/path/to/file", "new/path/to/file", keep_file = True )

# Remove a record from the registry.
reg.remove( "/path/to/file" )

# Only remove the references in the registry but keep the file in the filesystem.
reg.remove( "/path/to/file", keep_file = True )
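
When several files were reorganized outside of filerecords, the registry references can be corrected in one pass. The sketch below relies only on the documented move() signature with keep_file=True. The helper name and the mapping argument are hypothetical, and the final save() is a precaution, since it is not stated here whether move() persists its changes on its own.

```python
def relink_moved_files(reg, moves):
    """Update registry paths for files that were already moved on disk.

    `moves` maps each old path to its new path. `reg` is expected to be
    a filerecords Registry; this helper is not part of the package.
    """
    for current, new in moves.items():
        # keep_file=True adjusts only the reference in the registry
        # and leaves the filesystem untouched.
        reg.move(current, new, keep_file=True)
    # Precautionary save (see the note above about this assumption).
    reg.save()
    return len(moves)
```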

Exporting the contents of the registry

The registry contents can be represented in Markdown format using the to_markdown() method, or in YAML format using the to_yaml() method.

# Export the registry to a markdown file.
reg.to_markdown( timestamp = True, filename = "registry.md" )

# Export the registry to a yaml file.
reg.to_yaml( timestamp = True, filename = "registry.yaml" )

class filerecords.api.registry.Registry(directory: str = '.')

Bases: BaseRecord

The main class of a filerecords registry. It loads the registry information from a parent directory and makes the data accessible.

Note

This class will search for existing registries automatically in the filesystem and initialize a new registry if none are found. However, even if one is found a new registry can be initialized in the current directory by calling init().

Parameters

directory (str) – The directory to load the registry from. By default the current working directory is used.

add(filename: str, comment: Optional[str] = None, flags: Optional[list] = None)

Add a new file to the registry.

Parameters
  • filename (str) – The filename of the file to add.

  • comment (str) – The comment to add to the file.

  • flags (list) – Any flags to add. This can also be a defined flag-group label.

add_group(label: str, flags: list)

Add a flag group to the registry.

Note

This will not automatically save the registry’s metadata, use the save() method to do so.

Parameters
  • label (str) – The label of the group.

  • flags (list) – The flags of the group.

base_has_registry()

Checks if the current directory already has a registry.

get_record(filename: str)

Get the record of a file in the registry.

Parameters

filename (str) – The filename of the file to get the record of.

Returns

The record of the file or a list of records.

Return type

FileRecord or list

property groups: dict

Get the defined flag-groups.

init(permissions: Optional[int] = None)

Initialize a new registry in the given directory.

Parameters

permissions (int or str) – The permissions to use for the registry directory. By default the permissions of the parent directory are used.

move(current: str, new: str, keep_file: bool = False)

Move a file to a new location.

Parameters
  • current (str) – The filename of the file to move.

  • new (str) – The new filename to move the file to.

  • keep_file (bool) – If True only the path reference is adjusted within the registry. If False the file moving will also be performed.

remove(filename: str, keep_file: bool = False)

Remove a file from the registry.

Parameters
  • filename (str) – The filename of the file to remove.

  • keep_file (bool) – If True, the file will not be removed from the filesystem, only its records in the registry.

save()

Save the registry state and updated metadata.

search(pattern: Optional[str] = None, flag: Optional[str] = None)

Search for records in the registry either through a filename pattern or by a flag.

Parameters
  • pattern (str) – The filename pattern to search for.

  • flag (str) – The flag to search for. Note, this can only be a single flag! To search for multiple flags, first define a flag-group and then search for the group label using group:yourgroup.

Returns

A list of FileRecord objects of record entries matching the search criteria.

Return type

list

to_markdown(include_records: bool = True, timestamp: bool = False, filename: Optional[str] = None)

Convert the metadata to a markdown representation.

Parameters
  • include_records (bool) – Include the records in the markdown.

  • timestamp (bool) – Add a timestamp in the markdown.

  • filename (str (optional)) – The filename of the markdown file to create. If none is provided, no file is created.

Returns

The markdown representation of the registry.

Return type

str

to_yaml(include_records: bool = True, timestamp: bool = False, filename: Optional[str] = None)

Convert the source registry to a single YAML file.

Parameters
  • include_records (bool) – Include the records in the YAML output.

  • timestamp (bool) – Add a timestamp to the YAML output.

  • filename (str (optional)) – The filename of the yaml file to create. If none is provided, no file is created.

Returns

The assembled dictionary of the registry.

Return type

dict

update(filename: str, comment: Optional[str] = None, flags: Optional[str] = None)

Update an existing file record.

Parameters
  • filename (str) – The filename of the file to update.

  • comment (str) – The new comment to add to the file.

  • flags (str or list) – The new flags to add to the file.
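
Because to_markdown() returns a string and to_yaml() returns a dictionary, and no file is created when filename is omitted, both exports can be used in memory. The following sketch depends only on the signatures documented above; the helper name is hypothetical.

```python
def export_in_memory(reg, include_records=True):
    """Return the registry as (markdown_text, data_dict) without writing files.

    `reg` is expected to be a filerecords Registry; this helper is
    hypothetical and not part of the filerecords package.
    """
    # With filename left at its default (None), no file is created.
    markdown = reg.to_markdown(include_records=include_records, timestamp=True)
    data = reg.to_yaml(include_records=include_records, timestamp=True)
    return markdown, data
```

This can be handy for embedding the registry summary in a report, or for inspecting the assembled dictionary before deciding whether to write it to disk.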

Module contents

This is the core package of filerecords.