A peek under the hood of the Infobright 3.2 storage engine
By Osma on Monday 21 September 2009, 15:50
I've been meaning to post some real-world data on the performance of the Infobright 3.2 release, which came out a few weeks ago after an extended release candidate period. We're only now preparing our upgrades, so I don't yet have performance notes on significant data sets or complicated queries to post.
To make up for that, I decided to address a particular annoyance of mine in the community edition: first, because it wasn't addressed in the 3.2 release (and really, I'm hoping that doing this will get it included in 3.2.1), and second, simply because the engine being open source means I can. I feel that being OSS is one of Infobright's biggest strengths, alongside delivering pretty amazing performance from such a simple, undemanding package, and not making use of that would be a shame. Read on for details.
The annoyance? It's pretty difficult to tell, as a user, what the engine is doing while it's running queries. EXPLAIN isn't hooked up and falls back to the general MySQL code path (which, because the storage engine doesn't export any index information, simply assumes every query will be a full table scan). The SHOW PROCESSLIST status simply says "init" for the entire duration of every query, which could be minutes at a time. The engine does write quite a lot of detailed information into an optional debug log, but that lives on the database server, inaccessible to the user and the application, and is rather hard to read as well.
Fortunately, the existence of those debug statements made it very easy to find the places where I could insert some status instrumentation for the process list. This is certainly not perfect: it doesn't help with predicting the execution path before running a query, and the convention for process list status is far more terse than what the engine's debug output could provide. I could have simply copied the same level of detail into the process list, but that doesn't seem to be the norm in MySQL engines, and assuming that Infobright will later support the SHOW PROFILES feature, it would not be helpful anyway.
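To make the idea concrete, here is a minimal, self-contained sketch of the mechanism. It does not use the real MySQL headers: Session, set_stage, visible_state and run_toy_query are hypothetical stand-ins I made up for illustration, loosely shaped after the thd_proc_info calls in the patch. Returning the previous label is a convenience of this sketch only.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the server's per-connection THD object.
// Only the one field that the process list would display is modelled.
struct Session {
    const char* proc_info = "init";  // stays "init" unless somebody updates it
};

// Hypothetical stand-in for thd_proc_info(): store a short stage label.
// Returns the label that was set before (sketch convenience, an assumption).
inline const char* set_stage(Session& s, const char* stage) {
    const char* previous = s.proc_info;
    s.proc_info = stage;
    return previous;
}

// What another connection would read as the status of this session.
inline std::string visible_state(const Session& s) {
    return s.proc_info ? s.proc_info : "";
}

// A toy "query" that announces each phase before doing the work.
// seen_mid_query records what an observer would see during execution.
inline void run_toy_query(Session& s, std::string* seen_mid_query) {
    set_stage(s, "preparing");
    // ... preprocessing of conditions would happen here ...
    set_stage(s, "executing");
    if (seen_mid_query) *seen_mid_query = visible_state(s);
    // ... filtering and joining would happen here ...
    set_stage(s, "Sending data");
}
```

If the engine never calls the setter, an observer keeps seeing whatever label the upper layer set last, which is exactly the "init" symptom described above. The labels are short string literals, in keeping with the terse process list convention.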
The patch is below (or download it as raw text), and it applies on top of the 3.2 source package downloadable from the Infobright.org site. It builds with 'make EDITION=community release' and works for me, but use it at your own risk. Please do post notes and comments, though; I'd be interested to hear from other users. I'm sure the patch could be much improved, too.
Now, what would be really interesting is if the debug log's information about the knowledge grid evaluation could be turned into EXPLAIN output, but that would require more understanding of MySQL internals than I have...
This was the first time I looked at the source code for Infobright, and the second or third time I did so for MySQL in general. ICE is pretty impressive in its techniques too, not only in being the only integrated columnar engine, but also in having more join strategies than other engines I've used, and so forth. The code is tough to follow, though, and the source package includes a huge amount of unused stuff, like copies of both the InnoDB and NDB storage engines, neither of which is built from the code base. I guess a bit of clean-up would make this somewhat more approachable.
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerGeneral.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerGeneral.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerGeneral.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerGeneral.cpp 2009-09-18 12:13:23.001506795 +0300
@@ -26,6 +26,7 @@
mind->Empty();
return no_desc; // all done
}
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "joining");
//if(desc[0].val1.vc->IsConst() && desc[0].val2.vc == NULL) {
// // Special case: if there is a chance for one-dimensional filtering, execute one condition only.
// no_desc = 1;
@@ -48,7 +49,8 @@ int JoinerGeneral::ExecuteJoinConditions
bool loc_result;
bool stop_execution = false; // early stop for LIMIT
rccontrol.lock(m_conn.GetThreadID()) << "Starting joiner loop (" << mit.NoTuples() << " rows)." << unlock;
-
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "in joiner loop");
+
// The main loop for checking conditions
while(mit.IsValid()) {
if(mit.PackrowStarted()) {
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerHash.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerHash.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerHash.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerHash.cpp 2009-09-18 15:31:15.263505407 +0300
@@ -55,6 +55,7 @@
/////////////////// Prepare all descriptor information /////////////
// TODO: prepare a common language for both joined columns, if not compatible
bool first_found = true;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "hash join");
DimensionVector dims1(mind); // Initial dimension descriptions
DimensionVector dims2(mind);
for(int i = 0; i < desc.size(); i++) {
@@ -203,6 +204,7 @@
_int64 hash_row = 0; // hash_row = 0, otherwise deadlock for null on the first position
_int64 traversed_rows = 0;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "hash join traverse");
while(mit.IsValid()) {
if(m_conn.killed())
throw KilledRCException();
@@ -247,6 +249,7 @@
int no_of_matching_rows;
MIIterator mit(mind, matched_dims);
MIDummyIterator combined_mit(mind); // a combined iterator for checking non-hashed conditions, if any
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "hash join tuples");
while(mit.IsValid()) {
if(m_conn.killed())
throw KilledRCException();
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerLoop.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerLoop.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerLoop.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerLoop.cpp 2009-09-18 12:12:46.056444175 +0300
@@ -34,6 +34,8 @@
int cur_dim1, cur_dim2;
int attr1, attr2;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "loop join");
+
ParseDescriptor( desc[0], cur_t1, cur_t2, cur_dim1, cur_dim2, attr1, attr2, loc_op );
//////////////////////////////////////////////////////////////////////////////////
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerSort.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerSort.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/JoinerSort.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/JoinerSort.cpp 2009-09-18 15:32:01.921256944 +0300
@@ -29,6 +29,7 @@
VirtualColumn *vc2 = desc[0].val1.vc;
dim1 = vc1->GetDim();
dim2 = vc2->GetDim();
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "sort join");
// The only supported cases (for now):
if(dim1 == -1 || dim2 == -1 || // one-dim only
mind->GetFilter(dim1) == NULL ||
@@ -128,6 +129,8 @@
s1.Lock();
s2.Lock();
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "sort join apply");
+
MINewContents new_mind(mind);
new_mind.SetDimension(dim1);
new_mind.SetDimension(dim2);
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/MIRoughSorter.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MIRoughSorter.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/MIRoughSorter.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MIRoughSorter.cpp 2009-09-16 23:17:07.487972507 +0300
@@ -77,6 +77,7 @@
///////////////////////// the main sorting loop through bigblocks /////////
if(sorting_needed) {
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "sorting roughly");
rccontrol.lock(mind->m_conn->GetThreadID()) << "Sorting roughly multiindex..." << unlock;
_int64 start_tuple = 0;
_int64 stop_tuple = 0;
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/MIUpdatingIterator.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MIUpdatingIterator.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/MIUpdatingIterator.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MIUpdatingIterator.cpp 2009-09-18 15:12:03.293257577 +0300
@@ -116,6 +116,7 @@
{
if(!changed)
return;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "commit");
if(one_dim_filter) {
one_dim_filter->Commit(); // working directly on multiindex filer (special case)
mind->UpdateNoTuples();
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/MultiSorter.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MultiSorter.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/MultiSorter.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/MultiSorter.cpp 2009-09-18 11:31:50.202579444 +0300
@@ -372,6 +372,7 @@
}
else
rccontrol.lock(m_conn->GetThreadID()) << "Sorting " << no_obj << " rows..." << unlock;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "sorting");
if(max_rate < cur_rate)
max_rate = cur_rate;
int byte_ind = 4; // no. of bytes to encode row index (4 or 8)
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/Query_exeq_low.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/Query_exeq_low.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/Query_exeq_low.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/Query_exeq_low.cpp 2009-09-18 11:59:48.065253205 +0300
@@ -553,6 +553,7 @@
////////////////////////////////////////////////////////////////////////
if(desc.size() < 1)
return;
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "preparing");
DelayWhereConditions(desc);
SyntacticalDescriptorListPreprocessing(desc, mind, table);
@@ -619,6 +620,8 @@
return;
}
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "executing");
+
///////////////// Apply all one-dimensional filters (after where, i.e. without outer joins)
for(uint i = 0; i < desc.size(); i++)
if(!desc[i].done && desc[i].IsInner() && !desc[i].IsType_Join() && !desc[i].IsDelayed()) {
@@ -655,6 +658,7 @@
}
/////////////////////////////////////////////////////////////////////////////////////
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "joining");
DescriptorJoinOrdering(desc, mind);
///// descriptor display for joins
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/Query_optimize_RS.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/Query_optimize_RS.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/Query_optimize_RS.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/Query_optimize_RS.cpp 2009-09-18 11:27:55.040256644 +0300
@@ -104,6 +104,7 @@
MultiIndex &mind,
vector<Descriptor> &desc)
{
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "evaluating P2P");
bool is_nonempty = true;
// init by previous values of mind (if any nontrivial)
for(int i = 0; i < mind.NoDimensions(); i++)
{
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/RCEngine_results.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/RCEngine_results.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/RCEngine_results.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/RCEngine_results.cpp 2009-09-16 23:02:02.200221840 +0300
@@ -37,7 +37,7 @@
void RCEngine::SendResults(na::DataSource* exectree, THD* thd, select_result *res, List<Item> &fields)
{
int error = 0;
- thd->proc_info="Sending data";
+ thd_proc_info(thd,"Sending data");
DBUG_PRINT("info", ("%s", thd->proc_info));
res->send_fields(fields, Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF);
@@ -136,7 +136,7 @@
void RCEngine::SendResults(JustATable& results, THD* thd, select_result *res, List<Item> &fields, ConnectionInfo *conn)
{
int error = 0;
- thd->proc_info="Sending data";
+ thd_proc_info(thd,"Sending data");
DBUG_PRINT("info", ("%s", thd->proc_info));
res->send_fields(fields, Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF);
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/RoughJoinWatcher.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/RoughJoinWatcher.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/RoughJoinWatcher.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/RoughJoinWatcher.cpp 2009-09-16 23:18:25.821225774 +0300
@@ -184,6 +184,7 @@
// - after checking all the result:
// - if still potentially_excluded => set as non-intersecting and up to date
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "updating P2P");
rccontrol.lock(mind.m_conn->GetThreadID()) << "Updating P2P..." << unlock;
anything_to_update = false;
_int64 pairs_already_updated = 0;
diff -ur infobright-3.2-x86_64src/src/storage/brighthouse/core/TempTable_aggregate.cpp infobright-3.2-x86_64src.new/src/storage/brighthouse/core/TempTable_aggregate.cpp
--- infobright-3.2-x86_64src/src/storage/brighthouse/core/TempTable_aggregate.cpp 2009-08-26 21:26:43.000000000 +0300
+++ infobright-3.2-x86_64src.new/src/storage/brighthouse/core/TempTable_aggregate.cpp 2009-09-18 11:30:24.149255211 +0300
@@ -65,6 +65,7 @@
::Filter tuple_left(mit.NoTuples());
tuple_left.Set();
gbw.SetDistinctTuples(tuple_left.NoObj());
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "aggregating");
do {
if(rccontrol.isOn()) {
if(upper_approx_of_groups == 1)
@@ -222,6 +223,7 @@
void TempTable::MultiDimensionalDistinctScan(GroupByWrapper& gbw, DimensionVector &dims)
{
MEASURE_FET("TempTable::MultiDimensionalDistinctScan(GroupByWrapper& gbw)");
+ thd_proc_info(&ConnectionInfoOnTLS.Get().Thd(), "Distinct scan");
while(gbw.AnyOmittedByDistinct()) { /////////// any distincts omitted? => another pass needed
///// Some displays
_int64 max_size_for_display = 0;
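One possible clean-up, sketched here only as an idea: most hunks in the patch repeat the same ConnectionInfoOnTLS.Get().Thd() lookup, so a small helper could do that in one place. The code below is a stand-alone mock, not Infobright or MySQL code. FakeThd, current_thd, set_engine_stage and StageScope are hypothetical names, and the thread_local object merely imitates a per-thread connection lookup. The scope guard that restores the previous label is a variation the patch does not use.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for THD; only the displayed status is modelled.
struct FakeThd {
    const char* proc_info = "init";
};

// Hypothetical stand-in for the thread-local connection lookup:
// each thread gets its own FakeThd instance.
inline FakeThd& current_thd() {
    thread_local FakeThd thd;
    return thd;
}

// Single place for the lookup that the patch spells out at each call site.
// Returns the previously set label so callers may restore it if they wish.
inline const char* set_engine_stage(const char* stage) {
    FakeThd& thd = current_thd();
    const char* previous = thd.proc_info;
    thd.proc_info = stage;
    return previous;
}

// Optional scope guard: sets a label on entry and puts the old one back
// when the scope ends, including when an exception unwinds the stack.
class StageScope {
public:
    explicit StageScope(const char* stage)
        : previous_(set_engine_stage(stage)) {}
    ~StageScope() { set_engine_stage(previous_); }
    StageScope(const StageScope&) = delete;
    StageScope& operator=(const StageScope&) = delete;

private:
    const char* previous_;
};
```

With something like this, a call site would shrink to a single short call with the stage label, and a nested phase such as a sort inside a join could report itself without losing the outer label. Whether restoring the outer label is desirable is a judgment call; plain overwriting, as the patch does, is simpler.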
Comments
Looks interesting - just about to try Infobright myself. We currently run the Percona-built MySQL binaries - is it possible to use Infobright as a plugin? Is InnoDB available in the Infobright version?
I don't think the storage engine would build as a plugin; I believe they've had to change parts of the upper MySQL layers to inject the columnar execution model into the system. The server does not include InnoDB, and I have not tried installing the InnoDB plugin or building a server with InnoDB included.
Even if the above were possible, I would still say you're better off putting the analytics-oriented tables into their own instance, or preferably on an entirely separate machine from the InnoDB server, in particular if you're concerned enough about performance to run the Percona builds. I would recommend this even if you were using MyISAM as the table engine for the analytics tables. It's really not the same database system, even though the same clients work, because the workload, and with it the correct server tuning, is entirely different.
Some early notes on query performance between Infobright 3.1.1 and 3.2: