Notes and questions on logs from reading #5
Comments
Hi @Cleop, thank you for opening this issue and summarising your knowledge quest! 🎉 Stoked that you found the example created by @Danwhy clear and followed it up with your answers to your questions (above).
There is no need to confuse beginners with the term "write-ahead log" https://en.wikipedia.org/wiki/Write-ahead_logging ("TMI"); it's just an append-only log. In practice, most companies/products never need log compaction these days.
tl;dr
Some Data gets Updated, Most Never Does
Many developers have an "observation bias" toward data changing (mutability): because we often change the data we interact with, we think that mutability is the "norm", but it really is not. Even when data appears to be mutable (from the User's perspective), it is often stored in an immutable log. The underlying store is immutable, but the UI/UX makes it appear that data is being mutated because the UI only displays the latest version of the data. In many popular web frameworks, preserving "history" of a record (or piece of content)
Examples in Every Industry
Examples of use-cases where Append-only Logs are useful were included To those examples we can add:
The fact is that append-only logs, while not often mentioned by name, are the norm.
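To illustrate the point that data can look mutable to the user while the underlying store stays immutable, here is a minimal Python sketch. The names `history`, `save` and `current_view`, and the blog-post data, are hypothetical and chosen only for illustration; they are not taken from this repo.

```python
# Append-only store: rows are added, never updated or deleted.
history = []


def save(entry_id, **fields):
    """Record a new version of an entry instead of overwriting the old one."""
    history.append({"entry_id": entry_id, **fields})


def current_view():
    """What a UI would display: only the newest version of each entry."""
    latest = {}
    for row in history:
        # Later rows replace earlier ones in the view, not in the store.
        latest[row["entry_id"]] = row
    return latest


save("post-1", title="Hello Wrold")
save("post-1", title="Hello World")  # an "edit" is just another append

print(current_view()["post-1"]["title"])  # Hello World
print(len(history))                       # 2
```

The user sees a single, corrected title, while both versions remain available in `history`.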
Address Book Example
In the example given in the If Thor decides to leave his parents' house (planet)
Thanks @mathiasbynens for this handy character/byte counter: https://mothereff.in/byte-counter 🥇
Historical Context of Mutable Data: It Was Too Expensive to Store Everything
To understand why many Web Application frameworks (still) over-write data when an
Today we can store an incredible amount of data on a MicroSD card the size of a postage stamp:
As someone who is new to logs, I could follow the example. However, while discussing the subject with others, key terms and topics came up that were new to me. I think it would be useful to include some of this context in the readme for those who may stumble upon this repo without first knowing what a log is.
What is a log?
Also known as a write-ahead log, commit log or transaction log. In this repo, "log" does not refer to 'application logging', the kind of logging you might see for error messages.
A log is one of the simplest possible storage abstractions: an append-only, totally ordered sequence of records ordered by time. Logs are usually visualised horizontally, from left to right.
They're not all that different from a file or a table. If we consider a file as an array of bytes and a table as an array of records, then a log can be thought of as a kind of table where records are sorted by time.
Logs are event driven: they continuously record what happened and when. Because the records are stored in the order in which the changes occurred, you can revert to any given point in time by finding it in your records. Logs can be read in near real-time, making them ideal for analytics. They are also helpful in the event of crashes or errors, because their record of the state of the data at all times means data can easily be restored. Keeping an immutable log of the history of your data means your data is kept clean and is never lost or changed. The log is added to by publishers of data and used / acted upon by subscribers, but the records themselves cannot be mutated.
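The properties described above (append-only, ordered by time, immutable records, subscribers reading forward, restoring state at a point in time) can be sketched in a few lines of Python. This is only an illustrative sketch: `Record`, `AppendOnlyLog`, `read_from` and `state_at` are hypothetical names, the log lives in memory, and the address data is made up.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True)  # frozen: a record cannot be mutated once created
class Record:
    offset: int       # position in the log, assigned on append
    timestamp: float  # when the event happened
    key: str
    value: Any


class AppendOnlyLog:
    def __init__(self) -> None:
        self._records: list[Record] = []

    def append(self, timestamp: float, key: str, value: Any) -> Record:
        # Refuse out-of-order timestamps so the log stays ordered by time.
        if self._records and timestamp < self._records[-1].timestamp:
            raise ValueError("records must be appended in time order")
        record = Record(len(self._records), timestamp, key, value)
        self._records.append(record)
        return record

    def read_from(self, offset: int = 0) -> Iterator[Record]:
        # A subscriber remembers its own offset and reads forward from it.
        yield from self._records[offset:]

    def state_at(self, timestamp: float) -> dict[str, Any]:
        # Replay every record up to and including `timestamp`.
        state: dict[str, Any] = {}
        for record in self._records:
            if record.timestamp > timestamp:
                break
            state[record.key] = record.value
        return state


log = AppendOnlyLog()
log.append(1.0, "thor", "Asgard")
log.append(2.0, "loki", "Jotunheim")
log.append(3.0, "thor", "Earth")

print(log.state_at(2.5))  # {'thor': 'Asgard', 'loki': 'Jotunheim'}
print(log.state_at(3.0))  # {'thor': 'Earth', 'loki': 'Jotunheim'}
```

Because nothing is ever overwritten, the earlier address is still recoverable after the later one is appended; "restoring" is just a replay up to the chosen time.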
Keywords
Time series database: a database system optimised for handling time series data (arrays of numbers indexed by time). They typically handle queries over historical data / time zones better than general-purpose relational databases.
Data integration: making all the data an organisation has available in all its services and systems.
Log compaction: methods for tidying up a log by deleting data that is no longer needed.
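As a rough illustration of log compaction, one common approach keeps only the most recent record for each key and drops the older ones. The sketch below assumes records are simple `(key, value)` tuples; the `compact` function is hypothetical and is not part of this repo.

```python
def compact(records):
    """Return a new list keeping only the last record for each key."""
    # First pass: remember the index of the final record per key.
    last_index = {}
    for index, (key, _value) in enumerate(records):
        last_index[key] = index
    # Second pass: keep a record only if it is the final one for its key,
    # preserving the original order of the survivors.
    return [
        record
        for index, record in enumerate(records)
        if last_index[record[0]] == index
    ]


records = [
    ("thor", "Asgard"),
    ("loki", "Jotunheim"),
    ("thor", "Earth"),
]

print(compact(records))  # [('loki', 'Jotunheim'), ('thor', 'Earth')]
```

Note the trade-off: after compaction the earlier value for `thor` is gone, so the full history can no longer be replayed.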
Questions
These notes and questions came from reading:
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying