Explaining new version naming schema. (#906)

* Explaining new Serialization Versioning schema used in Badger.
dgraph-io · Jul 2, 2019 · 2fa005c
1 parent 50ccc86
commit 2fa005c

Showing 3 changed files with 170 additions and 9 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,10 +2,98 @@
 All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
-and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
+and this project adheres to [Serialization Versioning](VERSIONING.md).
 
 ## [Unreleased]
 
+## [1.6.0] - 2019-07-01
+
+This is a release including almost 200 commits, so expect many changes - some of them
+not backward compatible.
+
+Regarding backward compatibility in Badger versions, you might be interested in reading
+[VERSIONING.md](VERSIONING.md).
+
+_Note_: The hashes in parentheses correspond to the commits that impacted the given feature.
+
+### New APIs
+
+- badger.DB
+  - DropPrefix (291295e)
+  - Flatten (7e41bba)
+  - KeySplits (4751ef1)
+  - MaxBatchCount (b65e2a3)
+  - MaxBatchSize (b65e2a3)
+  - PrintKeyValueHistogram (fd59907)
+  - Subscribe (26128a7)
+  - Sync (851e462)
+
+- badger.DefaultOptions() and badger.LSMOnlyOptions() (91ce687)
+  - badger.Options.WithX methods
+
+- badger.Entry (e9447c9)
+  - NewEntry
+  - WithMeta
+  - WithDiscard
+  - WithTTL 
+
+- badger.Item
+  - KeySize (fd59907)
+  - ValueSize (5242a99)
+
+- badger.IteratorOptions
+  - PickTable (7d46029, 49a49e3)
+  - Prefix (7d46029)
+
+- badger.Logger (fbb2778)
+
+- badger.Options
+  - CompactL0OnClose (7e41bba)
+  - Logger (3f66663)
+  - LogRotatesToFlush (2237832)
+
+- badger.Stream (14cbd89, 3258067)
+- badger.StreamWriter (7116e16)
+- badger.TableInfo.KeyCount (fd59907)
+- badger.TableManifest (2017987)
+- badger.Txn.NewKeyIterator (49a49e3)
+- badger.WriteBatch (6daccf9, 7e78e80)
+
+### Modified APIs
+
+#### Breaking changes:
+
+- badger.DefaultOptions and badger.LSMOnlyOptions are now functions rather than variables (91ce687)
+- badger.Item.Value now receives a function that returns an error (439fd46)
+- badger.Txn.Commit no longer takes any parameters (6daccf9)
+- badger.DB.Tables now receives a boolean (76b5341)
+
+#### Non-breaking changes:
+
+- badger.LSMOnlyOptions changed values (799c33f)
+- badger.DB.NewIterator now allows multiple iterators per RO txn (41d9656)
+- badger.Options.TableLoadingMode's new default is options.MemoryMap (6b97bac)
+
+### Removed APIs
+
+- badger.ManagedDB (d22c0e8)
+- badger.Options.DoNotCompact (7e41bba)
+- badger.Txn.SetWithX (e9447c9)
+
+### Tools:
+
+- badger bank disect (13db058)
+- badger bank test (13db058) --mmap (03870e3)
+- badger fill (7e41bba)
+- badger flatten (7e41bba)
+- badger info --histogram (fd59907) --history --lookup --show-keys --show-meta --with-prefix (09e9b63) --show-internal (fb2eed9)
+- badger benchmark read (239041e)
+- badger benchmark write (6d3b67d)
+
+## [1.5.5] - 2019-06-20
+
+* Introduce support for Go Modules
+
 ## [1.5.3] - 2018-07-11
 Bug Fixes:
 * Fix a panic caused due to item.vptr not copying over vs.Value, when looking
@@ -87,7 +175,9 @@ Bug fix:
 ## [1.0.1] - 2017-11-06
 * Fix an uint16 overflow when resizing key slice
 
-[Unreleased]: https://github.com/dgraph-io/badger/compare/v1.5.3...HEAD
+[Unreleased]: https://github.com/dgraph-io/badger/compare/v1.6.0...HEAD
+[1.6.0]: https://github.com/dgraph-io/badger/compare/v1.5.5...v1.6.0
+[1.5.5]: https://github.com/dgraph-io/badger/compare/v1.5.3...v1.5.5
 [1.5.3]: https://github.com/dgraph-io/badger/compare/v1.5.2...v1.5.3
 [1.5.2]: https://github.com/dgraph-io/badger/compare/v1.5.1...v1.5.2
 [1.5.1]: https://github.com/dgraph-io/badger/compare/v1.5.0...v1.5.1
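
Note on the `badger.Item.Value` entry above: the changelog only says that the method now receives a function that returns an error. The calling pattern this implies can be sketched as follows. The `item` type and `readCopy` helper are hypothetical stand-ins written for illustration, not Badger's code; the sketch assumes the caller wants to keep the bytes after the callback returns and therefore copies them.

```go
package main

import (
	"errors"
	"fmt"
)

// item is a hypothetical stand-in for a stored entry; it is NOT Badger's Item.
type item struct{ val []byte }

// Value hands the stored bytes to fn and returns whatever error fn returns.
func (it *item) Value(fn func(val []byte) error) error {
	return fn(it.val)
}

// readCopy returns a private copy of the value so that it stays usable
// after the callback has returned. An error raised inside the callback
// is propagated to the caller unchanged.
func readCopy(it *item) ([]byte, error) {
	var out []byte
	err := it.Value(func(val []byte) error {
		if len(val) == 0 {
			return errors.New("empty value")
		}
		out = append([]byte(nil), val...)
		return nil
	})
	return out, err
}

func main() {
	v, err := readCopy(&item{val: []byte("42")})
	fmt.Println(string(v), err)

	_, err = readCopy(&item{})
	fmt.Println(err)
}
```

With a callback of this shape, validation or decoding errors raised while reading the value travel back through the single `error` return of `Value`.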

diff --git a/README.md b/README.md
@@ -6,7 +6,7 @@ BadgerDB is an embeddable, persistent and fast key-value (KV) database
 written in pure Go. It's meant to be a performant alternative to non-Go-based
 key-value stores like [RocksDB](https://github.com/facebook/rocksdb).
 
-## Project Status [Oct 27, 2018]
+## Project Status [Jun 26, 2019]
 
 Badger is stable and is being used to serve data sets worth hundreds of
 terabytes. Badger supports concurrent ACID transactions with serializable
@@ -15,14 +15,20 @@ snapshot isolation (SSI) guarantees. A Jepsen-style bank test runs nightly for
 Badger has also been tested to work with filesystem level anomalies, to ensure
 persistence and consistency.
 
-Badger v1.0 was released in Nov 2017, with a Badger v2.0 release coming up in a
-few months. The [Changelog] is kept fairly up-to-date.
+Badger v1.0 was released in Nov 2017, and the latest version that is data-compatible
+with v1.0 is v1.6.0.
+
+Badger v2.0, a new release coming up very soon, will use a new storage format that won't
+be compatible with any of the v1.x releases. The [Changelog] is kept fairly up-to-date.
+
+For more details on our version naming schema, please read [Choosing a version](#choosing-a-version).
 
 [Changelog]:https://github.com/dgraph-io/badger/blob/master/CHANGELOG.md
 
 ## Table of Contents
  * [Getting Started](#getting-started)
     + [Installing](#installing)
+      - [Choosing a version](#choosing-a-version)
     + [Opening a database](#opening-a-database)
     + [Transactions](#transactions)
       - [Read-only transactions](#read-only-transactions)
@@ -61,6 +67,27 @@ $ go get github.com/dgraph-io/badger/...
 This will retrieve the library and install the `badger` command line
 utility into your `$GOBIN` path.
 
+#### Choosing a version
+
+BadgerDB is an unusual package in that the most important changes we can make to it are not to
+its API but to how data is stored on disk.
+
+This is why we follow a version naming schema that differs from Semantic Versioning.
+
+- New major versions are released when the data format on disk changes in an incompatible way.
+- New minor versions are released whenever the API changes but data compatibility is maintained.
+ Note that changes to the API could be backward-incompatible - unlike Semantic Versioning.
+- New patch versions are released when there are no changes to either the data format or the API.
+
+Following these rules:
+
+- v1.5.0 and v1.6.0 can be used on top of the same files without any concerns, as their major
+ version is the same, therefore the data format on disk is compatible.
+- v1.6.0 and v2.0.0 are data incompatible as their major version implies, so files created with
+ v1.6.0 will need to be converted into the new format before they can be used by v2.0.0.
+
+For a longer explanation of the reasons behind using a new version naming schema, you can read
+[VERSIONING.md](VERSIONING.md).
 
 ### Opening a database
 The top-level object in Badger is a `DB`. It represents multiple files on disk
@@ -82,10 +109,7 @@ import (
 func main() {
   // Open the Badger database located in the /tmp/badger directory.
   // It will be created if it doesn't exist.
-  opts := badger.DefaultOptions
-  opts.Dir = "/tmp/badger"
-  opts.ValueDir = "/tmp/badger"
-  db, err := badger.Open(opts)
+  db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
   if err != nil {
 	  log.Fatal(err)
   }
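
Note on the README hunk above: `DefaultOptions` changes from a package-level variable to a function taking a path, and the changelog lists companion `Options.WithX` methods. A minimal sketch of that pattern follows. The `Options` struct, its three fields, and the two `With...` methods are simplified stand-ins chosen for illustration; they are assumptions, not a copy of Badger's definitions.

```go
package main

import "fmt"

// Options is a hypothetical, simplified configuration struct.
// It is NOT Badger's real Options type.
type Options struct {
	Dir        string
	ValueDir   string
	SyncWrites bool
}

// DefaultOptions returns a fresh value on every call, so one caller
// cannot accidentally change the defaults seen by another, which is
// possible when defaults live in a shared package-level variable.
func DefaultOptions(path string) Options {
	return Options{Dir: path, ValueDir: path, SyncWrites: true}
}

// WithSyncWrites returns a modified copy. The receiver is passed by
// value, so the original Options is left untouched.
func (o Options) WithSyncWrites(v bool) Options {
	o.SyncWrites = v
	return o
}

// WithValueDir returns a copy that stores values in a separate directory.
func (o Options) WithValueDir(dir string) Options {
	o.ValueDir = dir
	return o
}

func main() {
	base := DefaultOptions("/tmp/badger")
	tuned := base.WithSyncWrites(false).WithValueDir("/tmp/badger-values")

	fmt.Printf("base:  %+v\n", base)
	fmt.Printf("tuned: %+v\n", tuned)
}
```

Because each `With...` method works on a copy, calls can be chained in a single expression while `base` remains a reusable starting point.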

diff --git a/VERSIONING.md b/VERSIONING.md
@@ -0,0 +1,47 @@
+# Serialization Versioning: Semantic Versioning for databases
+
+Semantic Versioning, commonly known as SemVer, is a great idea that has been very widely adopted as
+a way to decide how to name software versions. The whole concept is very well summarized on
+semver.org with the following lines:
+
+> Given a version number MAJOR.MINOR.PATCH, increment the:
+> 
+> 1. MAJOR version when you make incompatible API changes,
+> 2. MINOR version when you add functionality in a backwards-compatible manner, and
+> 3. PATCH version when you make backwards-compatible bug fixes.
+> 
+> Additional labels for pre-release and build metadata are available as extensions to the
+> MAJOR.MINOR.PATCH format.
+
+Unfortunately, API changes are not the most important changes for libraries that serialize data for
+later consumption. For these libraries, such as BadgerDB, changes to the API are much easier to
+handle than changes to the data format used to store data on disk.
+
+## Serialization Version specification
+
+Serialization Versioning, like Semantic Versioning, uses 3 numbers and also calls them
+MAJOR.MINOR.PATCH, but the semantics of the numbers are slightly modified:
+
+Given a version number MAJOR.MINOR.PATCH, increment the:
+
+- MAJOR version when you make changes that require a transformation of the dataset before it can be
+used again.
+- MINOR version when old datasets are still readable but the API might have changed in
+backwards-compatible or incompatible ways.
+- PATCH version when you make backwards-compatible bug fixes.
+
+Additional labels for pre-release and build metadata are available as extensions to the
+MAJOR.MINOR.PATCH format.
+
+Following this naming strategy, migration from v1.x to v2.x requires a migration strategy for your
+existing dataset, and as such has to be carefully planned. Migrations between different minor
+versions (e.g. v1.5.x and v1.6.x) might break your build, as the API *might* have changed, but once
+your code compiles there's no need for any data migration. Lastly, changes between two different
+patch versions should never break your build or dataset.
+
+For more background on our decision to adopt Serialization Versioning, read the blog post
+[Semantic Versioning, Go Modules, and Databases][blog] and the original proposal on
+[this comment on Dgraph's Discuss forum][discuss].
+
+[blog]: https://blog.dgraph.io/post/serialization-versioning/
+[discuss]: https://discuss.dgraph.io/t/go-modules-on-badger-and-dgraph/4662/7
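
Note on the VERSIONING.md rules above: the three increment rules can be turned into a small upgrade check. The sketch below is an illustration only. The names `Impact`, `Classify`, and `parse` are hypothetical, and it assumes plain `vMAJOR.MINOR.PATCH` strings without pre-release or build-metadata labels.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Impact describes what moving between two versions demands under
// Serialization Versioning.
type Impact int

const (
	DropIn        Impact = iota // patch change: neither build nor dataset is affected
	APIReview                   // minor change: data is compatible, the build may break
	DataMigration               // major change: the dataset must be converted first
)

func (i Impact) String() string {
	switch i {
	case DropIn:
		return "drop-in"
	case APIReview:
		return "API review"
	default:
		return "data migration"
	}
}

// parse splits a string such as "v1.6.0" into its three numeric parts.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// Classify reports the most severe consequence of moving between the two
// versions: a differing MAJOR outranks a differing MINOR, which outranks
// a differing PATCH.
func Classify(from, to string) (Impact, error) {
	a, err := parse(from)
	if err != nil {
		return DropIn, err
	}
	b, err := parse(to)
	if err != nil {
		return DropIn, err
	}
	switch {
	case a[0] != b[0]:
		return DataMigration, nil
	case a[1] != b[1]:
		return APIReview, nil
	default:
		return DropIn, nil
	}
}

func main() {
	pairs := [][2]string{
		{"v1.5.0", "v1.6.0"},
		{"v1.6.0", "v2.0.0"},
		{"v1.5.3", "v1.5.5"},
	}
	for _, p := range pairs {
		impact, err := Classify(p[0], p[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s: %s\n", p[0], p[1], impact)
	}
}
```

The check mirrors the examples in the text: v1.5.0 to v1.6.0 calls for an API review only, while v1.6.0 to v2.0.0 calls for a data migration.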