Howard Chu [Tue, 13 Aug 2013 20:12:47 +0000 (13:12 -0700)]
ITS#7664 better fix
For RDONLY, don't get lockfile until we know datafile exists.
Also, don't try to create a new datafile for me_mfd if someone
deleted it after we got me_fd.
Howard Chu [Mon, 12 Aug 2013 00:15:03 +0000 (17:15 -0700)]
Fix obscure MDB_MULTIPLE bug
If a key has a single existing value, and then a put (MDB_MULTIPLE)
is done where the first of the multiple values matches the existing
value, the put would return SUCCESS without writing any of the
values. Fixed to loop to the next value as intended.
mdb_cursorpages_mark: Mark current txn and no more.
Ignore parent txn cursors since it is the current txn's dirty_list
which will be flushed. But check the current txn also when clearing,
since cursors can have pages which are dirty in a parent.
Check !mc_xcursor instead of !MDB_DUPSORT. Equivalent for valid
data, but a bit safer if the sub-DB flags are corrupt.
Pid locking needs a different lockfile-version: MDB_env's with and
without pid locking must not coexist, they can sabotage each other.
Store MDB_LOCK_FORMAT = (version | "use locking" flag) instead.
Treat unexpected errors as "don't know". Invert Pidcheck return
value, so nonzero including error codes = "the process may exist".
On Windows: Catch exited but still existing processes. Handle
undefined PROCESS_QUERY_LIMITED_INFORMATION.
On Unix: don't trust F_GETLK error to leave the input alone,
the fcntl() doc seems unclear.
mdb_copy: Does not copy lockfile. Can trigger file growth.
mdb_txn_begin(): Clarify usage restrictions.
mdb_drop(): State what to do rather than what will be done, since
closing the handle could otherwise be read as happening even at failure.
Howard Chu [Sun, 14 Jul 2013 15:20:18 +0000 (08:20 -0700)]
Fix stale sub-cursor C_INIT flag
Whenever we enter cursor_set() the sub-cursor's flag must be
cleared. If the new cursor position has valid subdata it will
be initialized again, if not then the sub-cursor has nothing
to point to.
(Restructuring for upcoming mdb_page_spill work.)
mdb_freelist_save() can't just Get() the destination, since
mdb_page_spill() may have put the destination in the read-only map.
TODO: Can this new put() modify the freelist, which would break it? The
final iteration's put() can shorten the node, the rest uses MDB_CURRENT.
We could set P_KEEP on dirty freeDB leaves and ovpages, since they are
all about to be modified. But the code in this commit must stay anyway,
if mdb should support dropping a 256G DB. I.e. too big for dirty_list.
If mdb_page_touch() sees a page in txn's dirty_list, that
is the page version txn's cursors should have. Fail if
the user may be seeing and depending on another version.
Restore mc_flags and xcursors, they were tracked but not merged.
Simplify: Track parent txn's original cursors after backing them
up, instead of tracking copies and merging them back at commit.
Page leak, mdb_page_alloc(). On error, don't shorten me_pghead.
Memleak, mdb_ovpage_free(). Free page or keep it in dirty_list.
Bad MIDL, mdb_midl_need(). Fix midl[-1] (allocated size).
Catch I/O errors. Do nothing between OS call failure and ErrCode().
Do not use errno after non-OS-errors like write() >= 0, which could
give a failure return of success (errno 0) or some irrelevant error
code. Drop seek calls, use pwrite/pread/Windows OVERLAPPED offset.
Don't put a 64-bit filesize in a 32-bit int before shifting
down. Always pass &sizehi to SetFilePointer->maxsize, so
sizelo not is treated a signed distance. Hide unused vars
when _WIN32. Reinitialize OVERLAPPED before reuse.
Grow midls earlier in order to catch errors earlier. Use
mdb_midl_need() instead of mdb_midl_grow(), then mdb_midl_xappend()
needs no error checks. Factor out mdb_midl_append_range().
MDB_env.me_pghead: Don't free it when empty. mdb_ovpage_free()
needs it, but cannot allocate it.
mdb_midl_alloc(): Fill in length=0.
mdb_page_alloc(): Also Skip freeDB if txnid<3, instead of <4,
and consistently DPRINTF consumed IDLs.
When copying, round up/down to aligned sizes. Skip the unused portion,
this was not done when touching a page dirty in the parent txn.
No other change in behavior.
Simplify mdb_page_touch(), including: Drop test m3==mc, the condition
is caught below. Don't "modify" the parent's pgno into the same pgno,
when a nested txn copies a parent's page into its freelist.
The tracking code should not change the current cursor.
It did when that was a C_SUB cursor, which should not be
checked against the tracked cursors but their xcursors.
However, do not bother to skip the tracking code for the
current cursor when it would not change that cursor anyway.
Do not binary-search dirty_list, it is unsorted when MDB_WRITEMAP.
Catch errors. In nested txns, put the page in mt_free_pgs after
all since pages dirty in a parent txn would add complexities.
Split up saving me_pghead, to make me_pgfree unneeded. Also mf_pghead
is now a midl. Needed after e7f6767ea815fe0ada1f95037dfdec176ec4d5bb
("Return fresh overflow pages to current pghead").
Tweak MDB_DEBUG freelist output, make it ascending.