
Before we explore practical examples, let's briefly explain how Tags are used.

Tags are a very flexible way to represent arbitrary information.

They are used to describe objects, to define tasks, to mark progress on tasks, and for many other use cases.

Rules are designed to work with Tags in an intuitive way. To understand how, let's look at an example.

An example

Each Tag consists of a Symbol, and optionally a weight (number) and a comment (string). A Tag may stand on its own, or it may have any number of arguments. Since Tags can have other Tags as arguments, you can build a complex DAG by combining Tags with each other.
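
To make this structure concrete, here is a minimal Python sketch of a Tag as described above. The `Tag` class, its field names, and the `symbols_reachable` helper are illustrative assumptions and are not Elody's actual data model or API. In particular, the sketch only allows other Tags as arguments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tag:
    """Hypothetical model of a Tag; not Elody's real implementation."""
    symbol: str                       # every Tag has a Symbol
    weight: Optional[float] = None    # optional number
    comment: Optional[str] = None     # optional string
    # Zero or more arguments; here they are always other Tags.
    arguments: List["Tag"] = field(default_factory=list)


def symbols_reachable(tag: Tag) -> List[str]:
    """Collect the Symbols reachable from a Tag by following its arguments."""
    seen: List[str] = []
    stack = [tag]
    while stack:
        current = stack.pop()
        if current.symbol not in seen:
            seen.append(current.symbol)
        stack.extend(current.arguments)
    return seen


# A Tag standing on its own, and a second Tag that takes it as an argument.
column = Tag("column")
types = Tag("info_column_types", comment=",string,", arguments=[column])
```

Because arguments are themselves Tags, chaining them this way produces a graph rather than a flat list, which is what makes the DAG shown in this section possible.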

The following DAG is an excerpt of a real Tag structure used by the task defined through the Symbol task_data_cleansing_and_analysis_for_pandas:

This graph is zoomable. Each Node in it represents one object, and most of the objects are Tags. Some objects are replaced with grey placeholders to reduce the size of the graph. Tags with the same Symbol are drawn with the same color. Tags have tooltips.

Understanding the example graph

The root of this DAG is a Tag called modifiable_file. This Tag represents a file, such as an excel sheet uploaded by a user, and keeps track of any modifications to that file as well as any analyses performed on it.

Some of the Tags occur multiple times, and each new occurrence is an update that overwrites the previous one. For example, there are multiple current_file Tags attached to the central modifiable_file Tag. Each time the file was modified, a new current_file Tag was created to associate the new file with all of the existing information. Another Tag that works the same way, where new versions overwrite old ones, is info_column_types, which attaches to a column and describes which datatypes that column has.

Other Tags do not overwrite each other, for example the column Tags. Each of these represents one column of the file.

Both of these very different types of relationships can easily be modeled with Rules:

  • Say you are writing a program to visualize a column of a file.

    You need to know the type of the column because your analysis depends on that information.

    Rather than having to analyze the column in the file yourself, you can look at the existing Tags. This way you don't need to worry about edge cases yourself, because that's the responsibility of whoever created the Tags.

    The search performed by a Rule's trigger will pick the latest Tag by default, so to find the types of a column, just search for a Tag with Symbol info_column_types targeting the column you want:

    {
    	"type" : "tag",
    	"symbol" : "info_column_types",
    	"arguments" : {
    		0 : "myColumn"
    	}
    }
  • Say you are writing a program that works only on text data.

    You want to know if there is any column that contains text.

    It's possible that the datatype of a column changes. Maybe a column used to be text, but then it was parsed and converted to numbers later.

    We now have to search for a column and make sure that the latest info_column_types Tag on it marks the column as text.

    Fortunately, this is not hard to express either:

    {
    	"type" : "tag",
    	"symbol" : "column",
    	"targeted_by" : [
    		{
    			"type" : "tag",
    			"symbol" : "info_column_types",
    			"search_postfilter" : {
    				"type" : "tag",
    				"comment_contains" : ",string,"
    			}
    		}
    	]
    }
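
The second search depends on an ordering detail: the latest info_column_types Tag of each column is picked first, and only then is its comment checked. The following Python sketch models that behaviour as this section describes it. Everything in it is a labelled assumption: the simplified `Tag` tuple, the `created` counter used to order Tags, the `,number,` comment value, and both helper functions are hypothetical and are not part of Elody.

```python
from typing import List, NamedTuple, Optional


class Tag(NamedTuple):
    """Hypothetical, simplified Tag: one Symbol, one target, one comment."""
    symbol: str
    target: str      # name of the column this Tag is attached to
    comment: str
    created: int     # assumed creation order; higher means newer


def latest_tag(tags: List[Tag], symbol: str, target: str) -> Optional[Tag]:
    """Return the newest Tag with the given Symbol on the given target."""
    matches = [t for t in tags if t.symbol == symbol and t.target == target]
    return max(matches, key=lambda t: t.created, default=None)


def text_columns(tags: List[Tag], columns: List[str]) -> List[str]:
    """Columns whose latest info_column_types comment contains ',string,'."""
    result = []
    for column in columns:
        newest = latest_tag(tags, "info_column_types", column)
        # The filter runs after the latest Tag was chosen, so an outdated
        # ',string,' Tag cannot produce a match.
        if newest is not None and ",string," in newest.comment:
            result.append(column)
    return result


tags = [
    Tag("info_column_types", "price", ",string,", created=1),
    Tag("info_column_types", "price", ",number,", created=2),  # parsed later
    Tag("info_column_types", "name", ",string,", created=1),
]
print(text_columns(tags, ["price", "name"]))  # prints ['name']
```

The price column is excluded even though it once carried a `,string,` Tag, because only its newest info_column_types Tag is considered.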
    

Understanding this becomes intuitive with practice. If you are wondering how to delete a column Tag, given that column Tags don't overwrite each other: that's what the !nullify Tag is for, which we will see later.

The next two pages demonstrate all of this with example Rules.

If you learn better from practical examples, you can also have a look at the Scenario Plan (developers only) Data Exploration demonstration with example file and tutorial information. It is the first practical example written for Elody. It was designed as a stress test to ensure that the Rule logic we use is not missing anything, so it is intentionally far more complex than most programs will be in practice. You can use Developer Mode to inspect what happens in the background and to look at the Rules and Options we use for it.

Note

By the way, you can create graph representations of Tag structures like the one above using the Symbol !export_object. It can be used to extract a structure of files and Tags so that it can be imported again in another Scenario. This way you can use one Scenario to perform an analysis and create Tags, while a second Scenario makes use of those Tags later.