Add a script to make review of pipeline logs easier #6

Open
wants to merge 8 commits into
base: master
from

Conversation

mhidas
Contributor

@mhidas mhidas commented Nov 23, 2017

DO NOT MERGE WIP / Proof of concept

Trying to address some of the requirements described in https://github.com/aodn/zzz-aodn-pipeline-poc/issues/143

@ghost ghost left a comment

As noted in the inline comments, I think it would be great if this could use generators everywhere, with no intermediate lists, to make streaming/filtering the log file(s) as efficient as possible. As far as I can tell, all of the work to filter/format lines can be done on a line-by-line basis?

"""
# read all log lines
with open(logfile) as log:
    lines = log.readlines()

If you get rid of this readlines() and just use "for line in log:" below, it will have the same result but use the file object's built-in iterator, which is much more memory efficient and probably faster (especially for large files).
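
A minimal sketch of that suggestion, assuming the surrounding function only needs to visit each line once. The name `iter_log_lines` is hypothetical and not taken from the PR; the point is only that iterating over the open file object reads one line at a time instead of building a list.

```python
def iter_log_lines(logfile):
    """Yield the lines of logfile one at a time.

    Iterating over the file object is lazy, so only the current line
    (plus the file's read buffer) is held in memory.
    """
    with open(logfile) as log:
        for line in log:
            # Strip the trailing newline so callers get clean text.
            yield line.rstrip("\n")
```

The file stays open only while the generator is being consumed, so callers should iterate it to completion or close it explicitly.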


m = INPUT_REGEX.match(line)
if m is not None:
    log_data.append(m.groupdict())

Same here as above: I think making this run off a generator that yields matching lines, rather than creating intermediate lists, would be more efficient.
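
One way this could look, as a sketch only. The actual `INPUT_REGEX` pattern is not shown in this conversation, so the pattern below is an invented stand-in, and `parse_lines` is a hypothetical name.

```python
import re

# Assumed log layout for illustration only; the real INPUT_REGEX differs.
INPUT_REGEX = re.compile(r"(?P<time>\S+ \S+) (?P<level>[A-Z]+) (?P<message>.*)")


def parse_lines(lines, regex=INPUT_REGEX):
    """Yield a dict of named groups for each line that matches regex.

    Non-matching lines are skipped, and nothing is accumulated in a list.
    """
    for line in lines:
        m = regex.match(line)
        if m is not None:
            yield m.groupdict()
```

Because `lines` can be any iterable, this can be fed directly from an open file object.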

"""
output = []
for data in log_data:
    output.append(fmt.format(**data))

Same here: format while iterating to avoid the intermediate list.
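
A sketch of formatting lazily, assuming each record is a dict whose keys match the placeholders in `fmt` (as `m.groupdict()` would produce). The function name `format_records` is hypothetical.

```python
def format_records(records, fmt):
    """Yield one formatted string per record instead of building a list."""
    for data in records:
        # Each record's keys fill the named placeholders in fmt.
        yield fmt.format(**data)
```

A caller can then print each line as it is produced, e.g. `for text in format_records(records, "{level}: {message}"): print(text)`.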

@mhidas
Contributor Author

mhidas commented Dec 7, 2017

Thanks @lwgordonimos, I think you mentioned that earlier, and I've been working on changes along those lines; I just haven't pushed them up yet. I'll get back to this soon.

mhidas added 6 commits July 24, 2018 17:26
* process log one line at a time
* create LogViewer class
* add separate log parsing and filtering methods of LogViewer class that return generators
* start implementing find_log() function
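
The class described in these commits is not shown in this conversation. As a rough illustration of the shape it might take, here is a sketch with separate parsing and filtering methods that both return generators. The method names, the regex, and the keyword-based filter are all assumptions, not the PR's actual code.

```python
import re


class LogViewer:
    """Hypothetical sketch of a line-at-a-time log viewer."""

    # Assumed log layout for illustration only.
    LINE_REGEX = re.compile(r"(?P<level>[A-Z]+) (?P<message>.*)")

    def __init__(self, logfile):
        self.logfile = logfile

    def parse(self):
        """Yield a dict of fields for each line that matches LINE_REGEX."""
        with open(self.logfile) as log:
            for line in log:
                m = self.LINE_REGEX.match(line.rstrip("\n"))
                if m is not None:
                    yield m.groupdict()

    def filtered(self, **criteria):
        """Yield parsed records whose fields equal every given criterion."""
        for data in self.parse():
            if all(data.get(key) == value for key, value in criteria.items()):
                yield data
```

Chaining the two generators this way means a filter such as `LogViewer(path).filtered(level="ERROR")` never holds more than one record in memory.
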
@gsatimos
Contributor

@mhidas I'm doing a sweep of outstanding PRs and this one looks relatively old. Can this be closed, or should it be left open?

@mhidas
Contributor Author

mhidas commented Sep 3, 2018

I'd like to actually get this script to a somewhat useable state and deploy it. Just haven't had time to work on it.

@lbesnard @bpasquer @ggalibert Would you find such a script useful? It would be similar to the input_log command we have for the old pipelines. The more specific use cases I've had in mind are listed here.

@lbesnard
Contributor

lbesnard commented Sep 3, 2018

I dunno exactly how this would work, but having autocomplete on the list of handlers would be nice

@ghost

ghost commented Sep 3, 2018

Autocompletion will be simple, probably by implementing a simple command line utility in aodncore.

@ggalibert
Contributor

Agreed, this would be great, with behaviour similar to the less command, for example.

4 participants