ATSD Python Client

Overview

ATSD Python Client enables Python developers to read statistics and metadata from Axibase Time Series Database and to build reporting, analytics, and alerting solutions. The client supports several ways of interfacing with the database, including SQL queries and REST API endpoints.

External References

Requirements

Check Python version.

python3 -V

The ATSD client supports Python ≥ 3.5.0.
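
The same check can be made from inside Python. The snippet below is a minimal sketch that uses only the standard library; `MIN_VERSION` and `is_supported` are illustrative names, not part of atsd_client.

```python
import sys

# Minimum version stated in the requirement above.
MIN_VERSION = (3, 5, 0)


def is_supported(version_info=sys.version_info):
    """Return True if the interpreter version meets the minimum."""
    return tuple(version_info[:3]) >= MIN_VERSION


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", is_supported())
```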

If necessary, install pip3 (pip for Python 3) with apt-get install python3-pip on Ubuntu.

Installation

Installing Module with pip3

Install the latest atsd_client module with pip3.

pip3 install atsd_client

Upgrade setuptools with pip3 install --upgrade setuptools.

Other Versions

Append a version number to the pip3 install command to install a specific version instead of the latest.

pip3 install atsd_client==2.3.0

Use this command to downgrade the module as well.

Module Version

Check atsd_client module version.

pip3 show atsd-client
Name: atsd-client
Version: 3.0.0
Summary: Axibase Time Series Database API Client for Python
Home-page: https://github.com/axibase/atsd-api-python
Author: Axibase Corporation
Author-email: [email protected]
License: Apache 2.0
Location: /usr/local/lib/python3.5/dist-packages
Requires: tzlocal, requests, python-dateutil
Required-by:
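
The version can also be read from a script. This is a hedged sketch, not part of atsd_client: `parse_version` and `installed_version` are hypothetical helpers, `parse_version` assumes a purely numeric version string such as 3.0.0, and `importlib.metadata` is only available in Python 3.8 and later.

```python
def parse_version(text):
    """Turn '3.0.0' into (3, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def installed_version(dist_name):
    """Return the installed version string, or None if it cannot be found."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python older than 3.8: this sketch gives up instead of guessing.
        return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("atsd-client:", installed_version("atsd-client"))
```

Comparing tuples rather than strings avoids ordering 2.10.0 before 2.9.0.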

To install the client on a system without Internet access, follow the Offline Installation Instructions.

Installing from Source

Clone the repository and run installation manually.

git clone https://github.com/axibase/atsd-api-python.git
cd atsd-api-python
python3 setup.py install

Verify Installation

Confirm all required modules are installed.

python3 -c "import tzlocal, requests, dateutil, atsd_client"

Empty output indicates successful installation. Otherwise, the output displays an error which enumerates missing modules.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named atsd_client
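
To list every missing module at once instead of stopping at the first import error, a check along the following lines can be used. It is a sketch built on the standard library only; `REQUIRED` and `missing_modules` are illustrative names.

```python
import importlib.util

# Top-level module names from the verification command above.
REQUIRED = ("tzlocal", "requests", "dateutil", "atsd_client")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```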

Upgrade

Execute pip3 install with the --upgrade option to upgrade the client to the latest version.

pip3 install atsd_client --upgrade

Execute pip3 list to view currently installed modules.

pip3 list
Package             Version
------------------- ------------------
asn1crypto          0.24.0
atsd-client         3.0.0
certifi             2018.4.16
cffi                1.11.5
...

Connection Test

Create connect_url_check.py which contains a basic connection test.

from atsd_client import connect_url

# Update connection properties and user credentials
connection = connect_url('https://atsd_hostname:8443', 'john.doe', 'password')

# Retrieve JSON from '/api/v1/version' endpoint
# https://axibase.com/docs/atsd/api/meta/misc/version.html
response = connection.get('v1/version')
build_info = response['buildInfo']
print('Revision: %s ' % build_info['revisionNumber'])

Navigate to the directory of the connect_url_check.py file and execute the test.

cd ./path/to/
python3 connect_url_check.py

The console output indicates a successful connection:

INFO:root:Connecting to ATSD at https://atsd_hostname:8443 as john.doe user.
Revision: 19###

Connecting to ATSD

To connect to an ATSD instance, hostname and port information is required. By default, ATSD listens for connection requests on port 8443.

Create a user account on the Settings > Users page, if needed.

Establish a connection with the connect_url method.

from atsd_client import connect_url
connection = connect_url('https://atsd_hostname:8443', 'john.doe', 'password')

Alternatively, create a connection.properties file.

base_url=https://atsd_hostname:8443
username=john.doe
password=password
ssl_verify=False

Launch Python and specify the path to the file connection.properties in the connect method.

from atsd_client import connect
connection = connect('/path/to/connection.properties')
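
The file is a plain list of key=value lines. The sketch below only illustrates that layout with the standard library; the connect method reads the file itself, and `parse_properties` is a hypothetical helper that may not match the client's own parsing rules (the handling of blank lines and # comments is an assumption).

```python
def parse_properties(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


sample = """\
base_url=https://atsd_hostname:8443
username=john.doe
password=password
ssl_verify=False
"""

props = parse_properties(sample)
print(props["base_url"])
```

All values are returned as strings, so a flag such as ssl_verify still has to be converted to a boolean by the caller.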

Debug

Set the logging level to DEBUG before importing atsd_client to include debug logs in console output:

import logging
logging.basicConfig(level=logging.DEBUG)
import atsd_client

DEBUG:root:Checking 'python-requests' version...
DEBUG:root:Module 'python-requests' version is 2.19.1. The version is compatible.
DEBUG:root:Connecting to ATSD at https://localhost:8443 as axibase user.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): localhost:8443
DEBUG:urllib3.connectionpool:https://localhost:8443 "GET /api/v1/entities?tags=%2A&expression=createdDate+%3E+%272018-05-16T00%3A00%3A00Z%27 HTTP/1.1" 200 None
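
The request path in the debug log is URL-encoded. To see which parameters were sent, it can be decoded with the standard library, as in this small sketch (`decode_query` is an illustrative helper, not a client function):

```python
from urllib.parse import parse_qs, urlsplit

# Request path copied from the last DEBUG line above.
logged_path = ("/api/v1/entities?tags=%2A&expression="
               "createdDate+%3E+%272018-05-16T00%3A00%3A00Z%27")


def decode_query(path):
    """Return the decoded query parameters of a logged request path."""
    query = urlsplit(path).query
    return {name: values[0] for name, values in parse_qs(query).items()}


params = decode_query(logged_path)
print(params["expression"])
```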

Services

The client supports services to insert and query particular types of records in the database. These include Series, Property, and Message records as well as metadata records such as Entity, Metric, and EntityGroup.

from atsd_client.services import *

# conn is the connection object returned by connect_url() or connect()
svc = SeriesService(conn)

Available services:

Models

Services insert and query particular types of records in the database. These record types are implemented as Python classes in the atsd_client.models module.

Inserting Data

Inserting Series

Initialize a Series object and populate the object with timestamped values.

from atsd_client.models import Series, Sample

series = Series(entity='sensor123', metric='temperature')
series.add_samples(
    Sample(value=1, time="2018-05-18T17:14:30Z"),
    Sample(value=2, time="2018-05-18T17:16:30Z")
)
svc.insert(series)

Inserting Properties

Initialize a Property object.

from atsd_client.models import Property

property = Property(type='disk', entity='nurswgvml007',
                    key={"mount_point": "sda1"},
                    tags={"fs_type": "ext4"})

svc = PropertiesService(conn)
svc.insert(property)

Inserting Messages

Initialize a Message object.

from atsd_client.models import Message

message = Message(entity='nurswgvml007', type="application", source="atsd", severity="MAJOR",
                  tags={"path": "/", "name": "sda"},
                  message="connect_to localhost port 8881 failed.")

svc = MessageService(conn)
svc.insert(message)

Querying Data

Querying Series

To query database series, pass the following filters to the SeriesService:

  • SeriesFilter: Required. Defines the metric name. Optionally, include data type, series tags, and other parameters.
  • EntityFilter: Optional. Accepts a single entity name, an array of multiple entity names, an entity group name, or an expression to filter entities.
  • DateFilter: Specifies startDate, endDate, and interval. To define the period, provide one of the following: both startDate and endDate; either startDate or endDate together with interval; or interval alone. If only interval is defined, the current time is used as endDate. Provide startDate and endDate as a calendar syntax keyword, an ISO 8601 formatted string, Unix milliseconds, or a Python datetime object.

from atsd_client.models import *
from datetime import datetime

sf = SeriesFilter(metric="temperature")
ef = EntityFilter(entity="sensor123")
df = DateFilter(start_date="2018-02-22T13:37:00Z", end_date=datetime.now())
query_data = SeriesQuery(series_filter=sf, entity_filter=ef, date_filter=df)
svc = SeriesService(conn)
result = svc.query(query_data)

# Print first Series object
print(result[0])

2018-07-18T17:14:30+00:00             1
2018-07-18T17:16:30+00:00             2
metric: temperature
entity: sensor123
tags: tz=local

Optional filters:

Refer to API Documentation for additional details.
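
The DateFilter rules for combining start date, end date, and interval can be sketched as follows. This is an illustration of the rules as described in this section, not the client's or the server's implementation: `resolve_period` is a hypothetical helper, and the interval is modelled as a `timedelta` for simplicity.

```python
from datetime import datetime, timedelta, timezone


def resolve_period(start=None, end=None, interval=None, now=None):
    """Return (start, end) for the field combinations a DateFilter accepts."""
    now = now or datetime.now(timezone.utc)
    if start is not None and end is not None:
        return start, end
    if interval is None:
        raise ValueError("provide start and end, or an interval")
    if start is not None:
        # Start date plus interval.
        return start, start + interval
    if end is None:
        # Only the interval is given: the period ends at the current time.
        end = now
    return end - interval, end


if __name__ == "__main__":
    print(resolve_period(interval=timedelta(hours=1)))
```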

Querying Data with SQL

To perform SQL queries, use the query method implemented in SQLService. The returned table is an instance of the pandas DataFrame class.

from atsd_client import connect_url
from atsd_client.services import SQLService

conn = connect_url('https://atsd_hostname:8443', 'user', 'passwd')

# Single-line SQL query
# query = 'SELECT datetime, time, entity, value FROM jvm_memory_free LIMIT 3'

# Multi-line SQL query, enclose in triple quotes (single or double)
query = """
SELECT datetime, time, entity, value
  FROM "jvm_memory_free"
ORDER BY datetime DESC
  LIMIT 3
"""

svc = SQLService(conn)
df = svc.query(query)

print(df)

                   datetime           time entity      value
0  2018-05-17T12:36:36.971Z  1526560596971   atsd  795763936
1  2018-05-17T12:36:21.970Z  1526560581970   atsd  833124808
2  2018-05-17T12:36:06.973Z  1526560566973   atsd  785932984

Pandas options used by atsd_client:

'display.expand_frame_repr' = False

Querying Properties

To retrieve property records from the database, specify the property type name and pass the following filters to the PropertiesService:

  • EntityFilter: Accepts a single entity name, an array of multiple entity names, an entity group name, or an expression to filter entities.
  • DateFilter: Specifies startDate, endDate, and interval fields. To define the period, provide one of the following: both startDate and endDate; either startDate or endDate together with interval; or interval alone. If only interval is defined, the current time is used as endDate. Provide startDate and endDate as a calendar syntax keyword, an ISO 8601 formatted string, Unix milliseconds, or a Python datetime object.

from atsd_client.models import *

ef = EntityFilter(entity="nurswgvml007")
df = DateFilter(start_date="today", end_date="now")
query = PropertiesQuery(type="disk", entity_filter=ef, date_filter=df)
svc = PropertiesService(conn)
result = svc.query(query)

# Print first Property object
print(result[0])

type: disk
entity: nurswgvml007
key: command=com.axibase.tsd.Server
tags: fs_type=ext4
date: 2018-05-21 14:46:42.728000+00:00

Optionally use additional property filter fields in PropertiesQuery, for example, key and key_tag_expression.

Refer to API Documentation for additional details.
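
Date fields accept ISO 8601 strings, Unix milliseconds, or datetime objects. The sketch below shows one instant in all three forms using only the standard library; the helper names are illustrative and the functions expect timezone-aware datetime objects.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(dt):
    """Convert an aware datetime to Unix milliseconds without float rounding."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_unix_ms(ms):
    """Convert Unix milliseconds to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def to_iso8601(dt):
    """Format an aware datetime as an ISO 8601 string in UTC, to the second."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


start = datetime(2018, 2, 22, 13, 37, tzinfo=timezone.utc)
print(to_iso8601(start), to_unix_ms(start))
```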

Querying Messages

To query messages, initialize a MessageQuery object and pass it to the MessageService with the following filters:

  • EntityFilter: Accepts a single entity name, an array of multiple entity names, an entity group name, or an expression to filter entities.
  • DateFilter: Specifies startDate, endDate, and interval fields. To define the period, provide one of the following: both startDate and endDate; either startDate or endDate together with interval; or interval alone. If only interval is defined, the current time is used as endDate. Provide startDate and endDate as a calendar syntax keyword, an ISO 8601 formatted string, Unix milliseconds, or a Python datetime object.
  • Additional filter fields: type, source, severity, and tags. To select records with a non-empty value for the given tag, set the filter value to * wildcard.

from atsd_client.models import *

ef = EntityFilter(entity="nurswgvml007")
df = DateFilter(start_date="today", end_date="now")
query = MessageQuery(entity_filter=ef, date_filter=df, type="application", tags={"syslog": "*"}, limit=1000)
svc = MessageService(conn)
messages = svc.query(query)

print("received messages: ", len(messages))

for msg in messages:
    print(msg)

entity: nurswgvml007
type: application
source: atsd
date: 2018-05-21 15:42:04.452000+00:00
severity: MAJOR
tags: syslog=ssh
message: connect_to localhost port 8881 failed.
persist: True

Refer to API Documentation for additional details.
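
The effect of the * wildcard in the tags filter can be pictured with a small client-side sketch. The database applies the filter on the server; `matches_tags` below is a hypothetical function that only mirrors the rule stated above, and it assumes that other filter values must match exactly.

```python
def matches_tags(record_tags, tag_filter):
    """Return True if the record tags satisfy every entry of the filter."""
    for name, expected in tag_filter.items():
        actual = record_tags.get(name, "")
        if expected == "*":
            # Wildcard: the tag must be present with a non-empty value.
            if not actual:
                return False
        elif actual != expected:
            return False
    return True


print(matches_tags({"syslog": "ssh"}, {"syslog": "*"}))
```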

Querying Portal

To export a portal, use the get_portal() method declared in PortalsService:

ps = PortalsService(connection)
ps.get_portal(id=192, entity="atsd", width=1000, height=700, portal_file="192.png", theme="default")

Pass additional parameters to the target portal as key=value pairs:

# Pass tag value (it can be accessed as ${fs_type})
ps.get_portal(name="ActiveMQ", entity="atsd", fs_type="ext4")

By default, portal_file is set to {portal_name}_{entity_name}_{yyyymmdd}.png, for example, ATSD_nurswghbs001_20181012.png.
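
The default file name pattern can be reproduced as follows, for example to predict where an exported image is saved. This is a sketch of the pattern only; `default_portal_file` is a hypothetical helper, and which date the client uses for the yyyymmdd part is an assumption left to the caller.

```python
from datetime import date


def default_portal_file(portal_name, entity_name, day):
    """Build a name of the form {portal_name}_{entity_name}_{yyyymmdd}.png."""
    return "{}_{}_{}.png".format(portal_name, entity_name, day.strftime("%Y%m%d"))


print(default_portal_file("ATSD", "nurswghbs001", date(2018, 10, 12)))
```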

Analyzing Data

Convert to pandas

Install the pandas module for advanced data manipulation and analysis.

pip3 install pandas

Use Pandas set_option to format output:

import pandas as pd

pd.set_option('display.width', 2000)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
# -1 disables truncation; pandas 1.0 and later expect None instead
pd.set_option('display.max_colwidth', -1)
pd.set_option('display.expand_frame_repr', False)

Series

Convert a Series object to a pandas Series and back with the built-in to_pandas_series() and from_pandas_series() methods.

ts = series.to_pandas_series()

# The index of ts is a pandas DatetimeIndex
print(ts)

2018-04-10 17:22:24.048000    11
2018-04-10 17:23:14.893000    31
2018-04-10 17:24:49.058000     7
2018-04-10 17:25:15.567000    22
2018-04-13 14:00:49.285000     9
2018-04-13 15:00:38            3

Entities

To retrieve the entity list as a pandas DataFrame, use the query_dataframe method:

entities = svc.query_dataframe(expression="createdDate > '2018-05-16T00:00:00Z'")

print(entities)

                createdDate  enabled            lastInsertDate          name
0  2018-07-12T14:52:21.599Z     True  2018-07-23T15:39:51.542Z  nurswgvml007
1  2018-07-17T20:08:02.213Z     True  2018-07-17T20:08:04.813Z  nurswghbs001
2  2018-07-12T14:52:21.310Z     True  2018-07-23T15:39:49.164Z          atsd

Pandas options used by atsd_client:

'display.expand_frame_repr' = False
'max_colwidth' = -1

Messages

To retrieve Message records as a pandas DataFrame, use the query_dataframe method:

messages = svc.query_dataframe(query, columns=['entity', 'date', 'message'])

print(messages)

         entity                             date                                            message
0  nurswgvml007 2018-07-17 18:49:24.749000+03:00  Scanned 0 directive(s) and 0 block(s) in 0 mil...
1  nurswgvml007 2018-07-17 18:48:24.790000+03:00  Scanned 0 directive(s) and 0 block(s) in 0 mil...
2  nurswgvml007 2018-07-17 18:48:16.129000+03:00                Indexing started, type: incremental

Pandas options used by atsd_client:

'display.expand_frame_repr' = False
'max_colwidth' = -1

Properties

To retrieve Property records as a pandas DataFrame, use the query_dataframe method:

properties = svc.query_dataframe(query)

print(properties)

                       date        entity    id  type  fs_type
0  2018-07-23T15:31:03.000Z  nurswgvml007   fd0  disk     ext3
1  2018-07-23T15:31:03.000Z  nurswgvml007   sda  disk     ext4
2  2018-07-23T15:31:03.000Z  nurswgvml007  sda1  disk     ext4

Pandas options used by atsd_client:

'display.expand_frame_repr' = False
'max_colwidth' = -1

Graph Results

To plot a series with matplotlib, use the plot() function:

>>> import matplotlib.pyplot as plt
>>> series.plot()
>>> plt.show()

Working with Versioned Data

Versioning tracks time series value changes for the purpose of audit trail and data reconciliation.

Enable versioning for specific metrics, then add optional versioning fields to samples with the version argument.

from datetime import datetime
from atsd_client.models import Series, Sample

other_series = Series('sensor123', 'power')
other_series.add_samples(
    Sample(3, datetime.now(), version={"source":"TEST_SOURCE", "status":"TEST_STATUS"})
)

To retrieve series values with versioning fields, add the VersioningFilter to the query and enable the versioned field.

import time
from atsd_client.models import *

cur_unix_milliseconds = int(time.time() * 1000)
sf = SeriesFilter(metric="power")
ef = EntityFilter(entity="sensor123")
df = DateFilter(start_date="2018-02-22T13:37:00Z", end_date=cur_unix_milliseconds)
vf = VersioningFilter(versioned=True)

query_data = SeriesQuery(series_filter=sf, entity_filter=ef, date_filter=df, versioning_filter=vf)
result = svc.query(query_data)

print(result[0])

           time         value   version_source   version_status
1468868125000.0           3.0      TEST_SOURCE      TEST_STATUS
1468868140000.0           4.0      TEST_SOURCE      TEST_STATUS
1468868189000.0           2.0      TEST_SOURCE      TEST_STATUS
1468868308000.0           1.0      TEST_SOURCE      TEST_STATUS
1468868364000.0          15.0      TEST_SOURCE      TEST_STATUS
1468868462000.0          99.0      TEST_SOURCE      TEST_STATUS
1468868483000.0          54.0      TEST_SOURCE      TEST_STATUS

See Versioning Documentation for more information.
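
The version_source and version_status columns in the output correspond to the keys of the version argument. The sketch below shows that relationship on plain dictionaries; it is an assumption about how the columns are derived rather than the client's actual code, and `flatten_sample` is a hypothetical name.

```python
def flatten_sample(time_ms, value, version=None):
    """Merge a sample and its version fields into one flat row."""
    row = {"time": time_ms, "value": value}
    for field, field_value in (version or {}).items():
        # Each version key becomes a column prefixed with 'version_'.
        row["version_" + field] = field_value
    return row


row = flatten_sample(1468868125000, 3.0,
                     {"source": "TEST_SOURCE", "status": "TEST_STATUS"})
print(row)
```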

Examples

Preparation

| Name | Description |
|------|-------------|
| version_check.py | Print Python and module version information. |

Connection

| Name | Description |
|------|-------------|
| connect_url_check.py | Connect to the target ATSD instance, retrieve database version, timezone and current time with the connect_url('https://atsd_hostname:8443', 'user', 'password') function. |
| connect_path_check.py | Connect to the target ATSD instance, retrieve database version, timezone and current time with the connect('/home/axibase/connection.properties') function. |
| connect_check.py | Connect to the target ATSD instance, retrieve database version, timezone and current time with the connect() function. |

Inserting Records

| Name | Description |
|------|-------------|
| nginx_access_log_tail.py | Continuously read nginx access.log via tail -F, parse request logs as CSV rows, discard bot requests, insert records as messages. |

Data Availability

| Name | Description |
|------|-------------|
| find_broken_retention.py | Find series that ignore metric retention days. |
| metrics_without_last_insert.py | Find metrics without a last insert date. |
| entities_without_last_insert.py | Find entities without a last insert date. |
| find_series_by_value_filter.py | Retrieve series matching value filter expression. |
| find_lagging_series_for_entity_expression.py | Find entities that match the specified expression filter which have not been updated for more than one day. |
| find_lagging_series_for_entity.py | Find series for the specified entity that have not been updated for more than one day. |
| find_lagging_series_for_metric.py | Find series for the specified metric that have not been updated for more than one day. |
| find_lagging_series.py | Find series with last insert date which lags behind the maximum last insert date by more than the specified interval. |
| high_cardinality_series.py | Find series with high cardinality of tag combinations. |
| high_cardinality_metrics.py | Find metrics with high cardinality of tag combinations. |
| find_lagging_entities.py | Find entities that match the specified expression filter which no longer collect data. |
| find_stale_agents.py | Find entities which no longer collect data for a subset of metrics. |
| metrics_created_later_than.py | Find metrics created after the specified date. |
| entities_created_later_than.py | Find entities created after the specified date. |
| find_delayed_entities.py | Find entities more than N hours behind the metric last_insert_date. |
| series_statistics.py | Retrieve series for a given metric, for each series fetch first and last value. |
| frequency_violate.py | Print values that violate metric frequency. |
| migration.py | Compare series query responses before and after ATSD migration. |
| data-availability.py | Monitor availability of data for parameters defined in data-availability.csv. |
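
The lag check described for find_lagging_series.py can be sketched on plain data as follows. This is not the script's code: `find_lagging` is a hypothetical function that takes a mapping of series names to last insert dates, which the real script would obtain from the database.

```python
from datetime import datetime, timedelta, timezone


def find_lagging(last_insert_by_series, max_lag):
    """Return names whose last insert date trails the newest one by more than max_lag."""
    if not last_insert_by_series:
        return []
    newest = max(last_insert_by_series.values())
    return sorted(name for name, last_insert in last_insert_by_series.items()
                  if newest - last_insert > max_lag)


last_inserts = {
    "cpu_busy": datetime(2018, 7, 23, 15, 39, tzinfo=timezone.utc),
    "disk_used": datetime(2018, 7, 23, 15, 30, tzinfo=timezone.utc),
    "memory_free": datetime(2018, 7, 17, 20, 8, tzinfo=timezone.utc),
}
print(find_lagging(last_inserts, timedelta(days=1)))
```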

Data Manipulation

| Name | Description |
|------|-------------|
| copy_data.py | Copy data to a new period. |
| copy_data_for_the_metric.py | Copy data to a new metric. |
| transforming_schema.py | Copy data with transforming schema. |

Data Removal and Cleanup

| Name | Description |
|------|-------------|
| find_non-positive_values.py | Find series with non-positive values for the specified metric, and optionally delete. |
| delete_series.py | Delete samples for the given metric, entity, and any tags within the specified date interval. |
| delete_series_data_interval.py | Delete data for a given series with tags within the specified date interval. |
| delete_series_for_all_entity_metrics.py | Delete series for all metrics for the specified entity with names beginning with the specified prefix. |
| delete_series_for_entity_metric_tags.py | Delete all series for the specified entity, metric and series tags. |
| docker_delete.py | Delete docker host entities and related container/image/network/volume entities without data insertion during the previous seven days. |
| entities_expression_delete.py | Delete entities that match the specified expression filter. |
| delete_entity_tags.py | Delete specific entity tags from entities that match the specified expression filter. |
| delete_entity_tags_starting_with_expr.py | Delete entity tags beginning with the specified expression filter. |
| update_entity_tags_from_property.py | Update entity tags from the corresponding property tags. |

Reports

| Name | Description |
|------|-------------|
| sql_query.py | Execute SQL query and convert results into a DataFrame. |
| entity_print_metrics_html.py | Print metrics for an entity as an HTML or ASCII table. |
| export_messages.py | Export messages into CSV. |
| export_portals_for_docker_hosts.py | Export a template portal by name for all entities that are docker hosts. |
| message_dataframe.py | Execute Message query and convert results into a DataFrame. |
| message_dataframe_filtered.py | Execute Message query, convert results into a DataFrame, group by tag and filter. |
| message_dataframe_filtered_and_ordered.py | Execute Message query, convert results into a DataFrame, group by tag, filter, and sort by date. |
| message_referrer_report.py | Query messages and convert the result into an HTML table. |

Some of the examples above use the PrettyTable module to format displayed records.

pip3 install PrettyTable
# pip3 install https://pypi.python.org/packages/source/P/PrettyTable/prettytable-0.7.2.tar.gz