Explaining new version naming schema. (#906)
* Explaining new Serialization Versioning schema used in Badger.
Francesc Campoy authored and danielmai committed Jul 2, 2019
1 parent 50ccc86 commit 2fa005c
Showing 3 changed files with 170 additions and 9 deletions.
94 changes: 92 additions & 2 deletions CHANGELOG.md
@@ -2,10 +2,98 @@
All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
and this project adheres to [Serialization Versioning](VERSIONING.md).

## [Unreleased]

## [1.6.0] - 2019-07-01

This is a release including almost 200 commits, so expect many changes - some of them
not backward compatible.

For details on backward compatibility across Badger versions, see
[VERSIONING.md](VERSIONING.md).

_Note_: The hashes in parentheses correspond to the commits that impacted the given feature.

### New APIs

- badger.DB
  - DropPrefix (291295e)
  - Flatten (7e41bba)
  - KeySplits (4751ef1)
  - MaxBatchCount (b65e2a3)
  - MaxBatchSize (b65e2a3)
  - PrintKeyValueHistogram (fd59907)
  - Subscribe (26128a7)
  - Sync (851e462)

- badger.DefaultOptions() and badger.LSMOnlyOptions() (91ce687)
  - badger.Options.WithX methods

- badger.Entry (e9447c9)
  - NewEntry
  - WithMeta
  - WithDiscard
  - WithTTL

- badger.Item
  - KeySize (fd59907)
  - ValueSize (5242a99)

- badger.IteratorOptions
  - PickTable (7d46029, 49a49e3)
  - Prefix (7d46029)

- badger.Logger (fbb2778)

- badger.Options
  - CompactL0OnClose (7e41bba)
  - Logger (3f66663)
  - LogRotatesToFlush (2237832)

- badger.Stream (14cbd89, 3258067)
- badger.StreamWriter (7116e16)
- badger.TableInfo.KeyCount (fd59907)
- badger.TableManifest (2017987)
- badger.Txn.NewKeyIterator (49a49e3)
- badger.WriteBatch (6daccf9, 7e78e80)

### Modified APIs

#### Breaking changes:

- badger.DefaultOptions and badger.LSMOnlyOptions are now functions rather than variables (91ce687)
- badger.Item.Value now receives a function that returns an error (439fd46)
- badger.Txn.Commit doesn't receive any params now (6daccf9)
- badger.DB.Tables now receives a boolean (76b5341)

#### Non-breaking changes:

- badger.LSMOnlyOptions changed values (799c33f)
- badger.DB.NewIterator now allows multiple iterators per RO txn (41d9656)
- badger.Options.TableLoadingMode's new default is options.MemoryMap (6b97bac)

### Removed APIs

- badger.ManagedDB (d22c0e8)
- badger.Options.DoNotCompact (7e41bba)
- badger.Txn.SetWithX (e9447c9)

### Tools:

- badger bank disect (13db058)
- badger bank test (13db058) --mmap (03870e3)
- badger fill (7e41bba)
- badger flatten (7e41bba)
- badger info --histogram (fd59907) --history --lookup --show-keys --show-meta --with-prefix (09e9b63) --show-internal (fb2eed9)
- badger benchmark read (239041e)
- badger benchmark write (6d3b67d)

## [1.5.5] - 2019-06-20

* Introduce support for Go Modules

## [1.5.3] - 2018-07-11
Bug Fixes:
* Fix a panic caused due to item.vptr not copying over vs.Value, when looking
@@ -87,7 +175,9 @@ Bug fix:
## [1.0.1] - 2017-11-06
* Fix an uint16 overflow when resizing key slice

[Unreleased]: https://github.com/dgraph-io/badger/compare/v1.5.3...HEAD
[Unreleased]: https://github.com/dgraph-io/badger/compare/v1.6.0...HEAD
[1.6.0]: https://github.com/dgraph-io/badger/compare/v1.5.5...v1.6.0
[1.5.5]: https://github.com/dgraph-io/badger/compare/v1.5.3...v1.5.5
[1.5.3]: https://github.com/dgraph-io/badger/compare/v1.5.2...v1.5.3
[1.5.2]: https://github.com/dgraph-io/badger/compare/v1.5.1...v1.5.2
[1.5.1]: https://github.com/dgraph-io/badger/compare/v1.5.0...v1.5.1
38 changes: 31 additions & 7 deletions README.md
@@ -6,7 +6,7 @@ BadgerDB is an embeddable, persistent and fast key-value (KV) database
written in pure Go. It's meant to be a performant alternative to non-Go-based
key-value stores like [RocksDB](https://github.com/facebook/rocksdb).

## Project Status [Oct 27, 2018]
## Project Status [Jun 26, 2019]

Badger is stable and is being used to serve data sets worth hundreds of
terabytes. Badger supports concurrent ACID transactions with serializable
@@ -15,14 +15,20 @@ snapshot isolation (SSI) guarantees. A Jepsen-style bank test runs nightly for
Badger has also been tested to work with filesystem level anomalies, to ensure
persistence and consistency.

Badger v1.0 was released in Nov 2017, with a Badger v2.0 release coming up in a
few months. The [Changelog] is kept fairly up-to-date.
Badger v1.0 was released in Nov 2017, and the latest version that is data-compatible
with v1.0 is v1.6.0.

Badger v2.0, a new release coming up very soon, will use a new storage format that won't
be compatible with any v1.x release. The [Changelog] is kept fairly up-to-date.

For more details on our version naming schema, please read [Choosing a version](#choosing-a-version).

[Changelog]:https://github.com/dgraph-io/badger/blob/master/CHANGELOG.md

## Table of Contents
* [Getting Started](#getting-started)
  + [Installing](#installing)
    - [Choosing a version](#choosing-a-version)
  + [Opening a database](#opening-a-database)
  + [Transactions](#transactions)
    - [Read-only transactions](#read-only-transactions)
@@ -61,6 +67,27 @@ $ go get github.com/dgraph-io/badger/...
This will retrieve the library and install the `badger` command line
utility into your `$GOBIN` path.

#### Choosing a version

BadgerDB is unusual in that the most significant change we can make to it is not to its API
but to how data is stored on disk.

This is why we follow a version naming schema that differs from Semantic Versioning.

- New major versions are released when the data format on disk changes in an incompatible way.
- New minor versions are released whenever the API changes but data compatibility is maintained.
  Note that, unlike in Semantic Versioning, these API changes may be backward-incompatible.
- New patch versions are released when there are no changes to either the data format or the API.

Following these rules:

- v1.5.0 and v1.6.0 can be used on top of the same files without any concerns, as their major
version is the same, therefore the data format on disk is compatible.
- v1.6.0 and v2.0.0 are data incompatible as their major version implies, so files created with
v1.6.0 will need to be converted into the new format before they can be used by v2.0.0.

For a longer explanation of the reasons behind this version naming schema, you can read
[VERSIONING.md](VERSIONING.md).
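
The rules above reduce to comparing major versions. Here is a minimal sketch of that check, assuming versions are written as `vMAJOR.MINOR.PATCH`. The helpers `majorOf` and `dataCompatible` are hypothetical names used only for illustration; they are not part of Badger's API.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the MAJOR component from a version such as "v1.6.0".
// Hypothetical helper for illustration; not part of Badger.
func majorOf(version string) (int, error) {
	trimmed := strings.TrimPrefix(version, "v")
	parts := strings.SplitN(trimmed, ".", 3)
	if len(parts) != 3 {
		return 0, fmt.Errorf("version %q is not in MAJOR.MINOR.PATCH form", version)
	}
	return strconv.Atoi(parts[0])
}

// dataCompatible reports whether two versions share an on-disk format
// under the rules above, that is, whether their major versions match.
func dataCompatible(a, b string) (bool, error) {
	ma, err := majorOf(a)
	if err != nil {
		return false, err
	}
	mb, err := majorOf(b)
	if err != nil {
		return false, err
	}
	return ma == mb, nil
}

func main() {
	pairs := [][2]string{{"v1.5.0", "v1.6.0"}, {"v1.6.0", "v2.0.0"}}
	for _, p := range pairs {
		ok, err := dataCompatible(p[0], p[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s and %s data-compatible: %t\n", p[0], p[1], ok)
	}
}
```

A check like this only tells you whether a data conversion is needed; it says nothing about whether your code will still compile against the newer API.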

### Opening a database
The top-level object in Badger is a `DB`. It represents multiple files on disk
@@ -82,10 +109,7 @@ import (
func main() {
// Open the Badger database located in the /tmp/badger directory.
// It will be created if it doesn't exist.
opts := badger.DefaultOptions
opts.Dir = "/tmp/badger"
opts.ValueDir = "/tmp/badger"
db, err := badger.Open(opts)
db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
if err != nil {
log.Fatal(err)
}
47 changes: 47 additions & 0 deletions VERSIONING.md
@@ -0,0 +1,47 @@
# Serialization Versioning: Semantic Versioning for databases

Semantic Versioning, commonly known as SemVer, is a great idea that has been very widely adopted as
a way to decide how to name software versions. The whole concept is very well summarized on
semver.org with the following lines:

> Given a version number MAJOR.MINOR.PATCH, increment the:
>
> 1. MAJOR version when you make incompatible API changes,
> 2. MINOR version when you add functionality in a backwards-compatible manner, and
> 3. PATCH version when you make backwards-compatible bug fixes.
>
> Additional labels for pre-release and build metadata are available as extensions to the
> MAJOR.MINOR.PATCH format.

Unfortunately, API changes are not the most important changes for libraries that serialize data for
later consumption. For these libraries, such as BadgerDB, changes to the API are much easier to
handle than changes to the data format used to store data on disk.

## Serialization Version specification

Serialization Versioning, like Semantic Versioning, uses 3 numbers and also calls them
MAJOR.MINOR.PATCH, but the semantics of the numbers are slightly modified:

Given a version number MAJOR.MINOR.PATCH, increment the:

- MAJOR version when you make changes that require a transformation of the dataset before it can be
used again.
- MINOR version when old datasets are still readable but the API might have changed in
backwards-compatible or incompatible ways.
- PATCH version when you make backwards-compatible bug fixes.

Additional labels for pre-release and build metadata are available as extensions to the
MAJOR.MINOR.PATCH format.

Following this naming strategy, migration from v1.x to v2.x requires a migration strategy for your
existing dataset, and as such has to be carefully planned. Migrations in between different minor
versions (e.g. v1.5.x and v1.6.x) might break your build, as the API *might* have changed, but once
your code compiles there's no need for any data migration. Lastly, changes in between two different
patch versions should never break your build or dataset.
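
The increment rules above can be sketched as a small Go program. The `Change` and `Version` types are hypothetical names introduced for this illustration, and resetting the lower components to zero on a bump is the usual SemVer convention, assumed here rather than stated in this document.

```go
package main

import "fmt"

// Change describes the most significant kind of change in a release.
// These names are hypothetical and exist only for this sketch.
type Change int

const (
	BugFix           Change = iota // backwards-compatible bug fix
	APIChange                      // API changed; old datasets are still readable
	DataFormatChange               // datasets must be transformed before reuse
)

// Version is a MAJOR.MINOR.PATCH triple.
type Version struct {
	Major, Minor, Patch int
}

// String formats the version with a leading "v", e.g. "v1.6.0".
func (v Version) String() string {
	return fmt.Sprintf("v%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// Next returns the version that follows v for the given change.
// Lower components are reset to zero (assumed convention, as in SemVer).
func (v Version) Next(c Change) Version {
	switch c {
	case DataFormatChange:
		return Version{Major: v.Major + 1}
	case APIChange:
		return Version{Major: v.Major, Minor: v.Minor + 1}
	default:
		return Version{Major: v.Major, Minor: v.Minor, Patch: v.Patch + 1}
	}
}

func main() {
	v := Version{Major: 1, Minor: 5, Patch: 5}
	fmt.Println(v.Next(BugFix))           // v1.5.6
	fmt.Println(v.Next(APIChange))        // v1.6.0
	fmt.Println(v.Next(DataFormatChange)) // v2.0.0
}
```

Note that `APIChange` covers both compatible and incompatible API changes, which is the key difference from Semantic Versioning.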

For more background on our decision to adopt Serialization Versioning, read the blog post
[Semantic Versioning, Go Modules, and Databases][blog] and the original proposal in
[this comment on Dgraph's Discuss forum][discuss].

[blog]: https://blog.dgraph.io/post/serialization-versioning/
[discuss]: https://discuss.dgraph.io/t/go-modules-on-badger-and-dgraph/4662/7
