Howard Chu [Sun, 14 Jul 2013 15:20:18 +0000 (08:20 -0700)]
Fix stale sub-cursor C_INIT flag
Whenever we enter cursor_set() the sub-cursor's flag must be
cleared. If the new cursor position has valid subdata it will
be initialized again, if not then the sub-cursor has nothing
to point to.
(Restructuring for upcoming mdb_page_spill work.)
mdb_freelist_save() can't just Get() the destination, since
mdb_page_spill() may have put the destination in the read-only map.
TODO: Can this new put() modify the freelist, which would break it? The
final iteration's put() can shorten the node, the rest uses MDB_CURRENT.
We could set P_KEEP on dirty freeDB leaves and ovpages, since they are
all about to be modified. But the code in this commit must stay anyway,
if mdb should support dropping a 256G DB. I.e. too big for dirty_list.
If mdb_page_touch() sees a page in txn's dirty_list, that
is the page version txn's cursors should have. Fail if
the user may be seeing and depending on another version.
Restore mc_flags and xcursors, they were tracked but not merged.
Simplify: Track parent txn's original cursors after backing them
up, instead of tracking copies and merging them back at commit.
Page leak, mdb_page_alloc(). On error, don't shorten me_pghead.
Memleak, mdb_ovpage_free(). Free page or keep it in dirty_list.
Bad MIDL, mdb_midl_need(). Fix midl[-1] (allocated size).
Catch I/O errors. Do nothing between OS call failure and ErrCode().
Do not use errno after non-OS-errors like write() >= 0, which could
give a failure return of success (errno 0) or some irrelevant error
code. Drop seek calls, use pwrite/pread/Windows OVERLAPPED offset.
Don't put a 64-bit filesize in a 32-bit int before shifting
down. Always pass &sizehi to SetFilePointer->maxsize, so
sizelo not is treated a signed distance. Hide unused vars
when _WIN32. Reinitialize OVERLAPPED before reuse.
Grow midls earlier in order to catch errors earlier. Use
mdb_midl_need() instead of mdb_midl_grow(), then mdb_midl_xappend()
needs no error checks. Factor out mdb_midl_append_range().
MDB_env.me_pghead: Don't free it when empty. mdb_ovpage_free()
needs it, but cannot allocate it.
mdb_midl_alloc(): Fill in length=0.
mdb_page_alloc(): Also Skip freeDB if txnid<3, instead of <4,
and consistently DPRINTF consumed IDLs.
When copying, round up/down to aligned sizes. Skip the unused portion,
this was not done when touching a page dirty in the parent txn.
No other change in behavior.
Simplify mdb_page_touch(), including: Drop test m3==mc, the condition
is caught below. Don't "modify" the parent's pgno into the same pgno,
when a nested txn copies a parent's page into its freelist.
The tracking code should not change the current cursor.
It did when that was a C_SUB cursor, which should not be
checked against the tracked cursors but their xcursors.
However, do not bother to skip the tracking code for the
current cursor when it would not change that cursor anyway.
Do not binary-search dirty_list, it is unsorted when MDB_WRITEMAP.
Catch errors. In nested txns, put the page in mt_free_pgs after
all since pages dirty in a parent txn would add complexities.
Split up saving me_pghead, to make me_pgfree unneeded. Also mf_pghead
is now a midl. Needed after e7f6767ea815fe0ada1f95037dfdec176ec4d5bb
("Return fresh overflow pages to current pghead").
Tweak MDB_DEBUG freelist output, make it ascending.
Factor out mdb_find_oldest,mdb_dlist_free,dirty_list.
Do not rescan reader table (mdb_find_oldest) after "goto again".
Skip clearing dirty_list[nonzero].mid in mdb_dlist_free(); it
was not done in mdb_reset0() anyway.
mdb_page_malloc(): Add "number of pages" parameter.
mdb_page_get(): Add output param for how page was found.
Do not set return params on error.
mdb_cursor_put(): Catch mdb_page_get() error.
Prepares for next commit, no change in caller behavior other
than on mdb_page_get error.
No real change.
mdb_cursor_init() checks if it needs mx, so pass it unconditionally.
Set C_ALLOCD for shadow cursors, for clarity. (It was always set as
it should anyway from the origin cursor, which would have C_ALLOCD.)
Howard Chu [Wed, 1 May 2013 04:09:09 +0000 (21:09 -0700)]
Avoid assert
Due to underfilled branch page. We're in the process of merging/moving
nodes to it because we already know it's underfilled. Took this approach
rather than just removing the assert in mdb_page_search_root, because
that assert may yet catch other situations we don't know about.
(Although, it has been there since the original commit of mdb.c and
has never triggered any other times...)
Move key init into mdb_env_setup_locks().
Don't create unused TLS key when read-only filesystem.
Drop internal flag MDB_ROFS, we can instead test either
!me_txns, !mt_u.reader or me_lfd==INVALID_HANDLE_VALUE.