Step 4: Managing Metadata

Add Metadata to a Data Object

Add the following metadata elements to the file iris.data:

Key          Value
description  Iris dataset containing measurements of iris flowers
source       UCI Machine Learning Repository

Add metadata
imeta add -d /WCDSacc/courses/11032025/your_username/iris_data/iris.data description "Iris dataset containing measurements of iris flowers"
imeta add -d /WCDSacc/courses/11032025/your_username/iris_data/iris.data source "UCI Machine Learning Repository"

  1. Select the file iris.data.
  2. Enter the metadata key 'description' with the value "Iris dataset containing measurements of iris flowers" and click the 'Add' button.
  3. Enter the metadata key 'source' with the value "UCI Machine Learning Repository" and click the 'Add' button.

[Screenshot: Add metadata]

gocmd addmeta /WCDSacc/courses/11032025/your_username/iris_data/iris.data "description" "Iris dataset containing measurements of iris flowers"
gocmd addmeta /WCDSacc/courses/11032025/your_username/iris_data/iris.data "source" "UCI Machine Learning Repository"

from connect import connect_to_irods
import irods.exception

session = connect_to_irods()


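def add_meta(data_object, attribute, value=' ', units=None):
    """Attach one attribute-value-units triple (AVU) to a data object."""
    try:
        # units=None stores the AVU without units
        data_object.metadata.add(attribute, value, units)
        print('Metadata added.')
    except (irods.exception.CATALOG_ALREADY_HAS_ITEM_BY_THAT_NAME,
            irods.exception.CAT_SQL_ERR):
        # Treated here as: this exact AVU is already set on the object
        print('Metadata already exists.')
    except Exception as e:
        print(f'Something went wrong: {e}')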


# Example
var = session.data_objects.get('/WCDSacc/courses/11032025/your_username/iris_data/iris.data')
add_meta(var, 'description', 'Iris dataset containing measurements of iris flowers')
...

List Metadata of a Data Object

List the metadata of the data file to verify it has been added.

List metadata
imeta ls -d /WCDSacc/courses/11032025/your_username/iris_data/iris.data

You should see an output similar to:

AVUs defined for dataObj /WCDSacc/courses/11032025/your_username/iris_data/iris.data:
attribute: description
value: Iris dataset containing measurements of iris flowers
units: 
----
attribute: source
value: UCI Machine Learning Repository
units: 

It should look like this now:

[Screenshot: Added metadata]

gocmd lsmeta /WCDSacc/courses/11032025/your_username/iris_data/iris.data

You should see an output similar to:

286617  "description"   "Iris dataset containing measurements of iris flowers"  <empty units>
286618  "source"        "UCI Machine Learning Repository"       <empty units>

...
def list_meta(data_object):
    """Print every AVU attached to a data object."""
    try:
        for avu in data_object.metadata.items():
            print(f"    Attribute: {avu.name or ' '}")
            print(f"    Value: {avu.value or ' '}")
            print(f"    Units: {avu.units or ' '}")
            print("----")
    except (irods.exception.CollectionDoesNotExist,
            irods.exception.DataObjectDoesNotExist,
            irods.exception.NoResultFound):
        print('Folder or file does not exist.')


# Example: list_meta expects a data object, not a path string
var = session.data_objects.get('/WCDSacc/courses/11032025/your_username/iris_data/iris.data')
list_meta(var)
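
If you want to check the result in a script rather than by eye, the `imeta ls -d` output shown earlier can be turned into a list of dictionaries. The following is only a minimal sketch: the function name `parse_imeta_ls` is hypothetical, and it assumes the output has exactly the `attribute:` / `value:` / `units:` layout shown above, with `----` between entries.

```python
def parse_imeta_ls(text):
    """Parse text in the 'imeta ls' layout into a list of AVU dictionaries."""
    avus = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "----":
            # Separator between two AVUs: store the one collected so far
            if current:
                avus.append(current)
            current = {}
            continue
        # Split at the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep and key in ("attribute", "value", "units"):
            current[key] = value.strip()
        # Any other line (such as the "AVUs defined for ..." header) is skipped
    if current:
        avus.append(current)
    return avus


# Sample text copied from the output shown in this section
sample = """AVUs defined for dataObj /WCDSacc/courses/11032025/your_username/iris_data/iris.data:
attribute: description
value: Iris dataset containing measurements of iris flowers
units:
----
attribute: source
value: UCI Machine Learning Repository
units:
"""

avus = parse_imeta_ls(sample)
for avu in avus:
    print(avu["attribute"], "=", avu["value"])
```

In practice the text would come from running `imeta ls -d` yourself, for example by saving its output to a file and reading that file in Python.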

Metadata considerations

When using metadata, consider adding prefixes to your metadata names, so that attributes from different groups or projects do not collide.

Examples:

  • WUR_RDM:dataset:archive_status
  • or research_group_name:department:project_name:dataset:name
  • etc.
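
A small helper can keep such prefixed names consistent. This is a sketch only: the function names `make_attribute_name` and `has_prefix` are made up for illustration, and the colon separator is taken from the examples above rather than from any iRODS rule.

```python
SEPARATOR = ":"  # assumption: the same separator as in the examples above


def make_attribute_name(*parts):
    """Join prefix parts and a field name into one prefixed attribute name."""
    cleaned = [part.strip() for part in parts]
    if not cleaned or any(part == "" for part in cleaned):
        raise ValueError("every part of the name must be non-empty")
    if any(SEPARATOR in part for part in cleaned):
        raise ValueError(f"parts must not contain {SEPARATOR!r}")
    return SEPARATOR.join(cleaned)


def has_prefix(attribute_name, *prefix_parts):
    """Return True if attribute_name starts with the given prefix parts."""
    leading = attribute_name.split(SEPARATOR)[:len(prefix_parts)]
    return leading == list(prefix_parts)


name = make_attribute_name("WUR_RDM", "dataset", "archive_status")
print(name)
print(has_prefix(name, "WUR_RDM", "dataset"))
```

The resulting string can then be passed as the attribute argument when adding metadata, for example to the `add_meta` function defined earlier on this page.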