Call us today: 0845 257 0137

Teradata, Netezza, Greenplum, big data - hints, tips and tales.

We all now massively parallel processing (MPP) databases like nothing more than plain old batch SQL scripts to chew on, right? Where would we be without clients like Teradata BTEQ, Netezza nzsql or Greenplum psql? Lost, probably.

Parallel databases are quite picky – they like to run things in parallel. Those complicated procedural programming languages don’t lend themselves to parallel execution by our favourite MPP databases. The upshot? No pesky procedural programming for us Teradata, Netezza or Greenplum jockeys, no sir! Java? Nope. C? Nope. Cobol? Well, maybe.

The good thing about batch SQL scripts is that they’re easy to write, is it not? Developers in the IT department and business users alike can all develop scripts that can be run once, many times, or even implemented into production to run on the schedule. Happy days.

Let’s face it there are only 4 basic commands to worry about:

  • select – get me stuff from the database
  • insert – add stuff to the database
  • update – change stuff in the database
  • delete – get rid of stuff from the database

There’s even an acronym to describe these commands collectively: CRUD

So, MPP databases *really* like SQL *and* its easy to write, so what’s the problem?

Well, the problem is exactly that – SQL is easy to write. It’s also very easy to write badly. And here’s the secret : poorly formatted SQL runs slower than well formatted SQL. In a nutshell, neat SQL is fast, really neat SQL (RNSQL) is even faster. When we are dealing with terabytes of data in a single table and queries that span tens or even hundreds of processing nodes, optimal code is really important.

So, what’s the take-away message? Easy – formatting matters a lot. Always aim to feed your database RNSQL. It deserves it.

Here’s a quick video to illustrate the point about RNSQL.

Enjoy!

 

So, what’s the connection between Rainstor in the staid world of data management and Her Majesty’s finest secret agent?

Allow us to explain, if you will. It all started at last year’s Teradata EMEA Universe event in Dublin. Team VLDB happened to be walking around the expo, as you do, trying to find out what’s new and shiny in the world of big data management.

Over towards the corner of the hall, looking like they could do with a vistior or two, we bumped into Anthony (sales) and Deirdre (marketing) from Rainstor. “Tell us your story” we opened with, as per usual when we don’t know a technology particularly well.

Anthony and Deirdre duly told us the Rainstor story, including the background to the product. It turns out to have been originally developed in the UK by GCHQ. Being gentlemen of a certain age, this gave us all the latitude we needed for the (rather tenuous) 007 connection. And why not indeed!

We listened very attentively to the Rainstor story as told by Anthony and Deirdre. It turned out to be such an interesting product that the folks from Rainstor had to complete the story over a Guinness or three and fish ‘n’ chips in a very welcoming Dublin hostelry that night. It’s a dirty job but someone’s got to do it.

Anyway, since meeting the folks from Rainstor a year ago we’ve managed to pull off a couple of very successful finance projects together. It seems that the combination of cost, ease of deployment, SQL compatibility and highly skilled PS partners (naturally) makes for a very well received big data solution.

We could tell you more about about our Rainstor projects, but then we’d have to kill you, in true 007 style. Well maybe not.

On a more serious note, there is more about Rainstor at Wikipedia here.

Teradata Top 10 Tips #2

#2 – Keep It Simple

We all know that Teradata SQL is easy to write, right? Sadly, bad SQL is even easier to write. By ‘bad’ SQL we typically mean SQL that generates an inefficient query plan.

No matter how we submit an SQL query, whether we use SQL Assistant or BTEQ or some other SQL tool, we are telling the Teradata system what we want.

The Teradata optimiser has the often unenviable task of turning the submitted SQL query into an execution plan. This describes how the Teradata system will physically execute the SQL query against the database.

To see the execution plan for any query simply prefix the SQL with the ‘EXPLAIN’ modifier and re-submit to Teradata. The response is how Teradata will execute the query. But you knew that, right?

It can be quite a challenge for the optimiser to turn a piece of arbitrary SQL into an efficient execution plan. The Teradata optimiser is ‘cost based’ and aims to generate the most efficient query plan, as you might expect.

Although the Teradata optimiser is undoubtedly one of the best in the market, sometimes even Teradata generates bad query plans. The difference between a ‘good’ (efficient) plan and a ‘bad’ (inefficient) plan can be the difference between a query that takes seconds/minutes and one that takes hours…or even days! A bad plan will simply consume resources and keep running until it hits a limit of some sorts – often spool.

A lot of what the Teradata optimiser has to do is to decide the sequence of table joins in the query. This is largely a function of the number of tables referenced in the query.

Consider the following relationship between the number of tables in a query and the number of ways in which they can be joined:

Number of Tables Possible Join Orders
4 24
5 120
6 720
7 5,040
8 40,320
9 362,880
10 3,628,800

The number of possible join orders is the factorial of the number or tables in the query. As more tables are added, even one at a time, the number of possible join orders that the optimiser has to consider soon becomes very large.

We can help the optimsier in its quest to generate efficient query plans by not submitting massive SQL queries that join lots and lots of tables together just because we can – you know who you are ;-)

With this in mind, may we suggest the following:

•    DO NOT join more than 6 fact (non-reference) tables together in a single SQL statement
•    DO use work tables to store intermediate results – break that massive join into multiple smaller joins with intermediate results stored in tables…it’s what the optimiser does anyway under the covers via spool tables

It is not big or clever to write monster SQL queries, just because we can, and ‘let the optimiser figure it out’. That will only work up to a point, after which the chances of an inefficient plan become too high.

As the man said:

“Simplicity is the ultimate sophistication”
Leonardo Da Vinci (1452-1519)

Who are we to argue?

,

As reported on El Reg and elsewhere, Amazon announced the general availability of ‘Redshift’ a few days ago:

http://aws.amazon.com/redshift/

What is Amazon Redshift then?

Redshift is a licensed version of the ParAccel Analytic Database (PADB) that Amazon has deployed on the Amazon Web Services (AWS) public cloud infrastructure.

The main Redshift features are as follows:

Perhaps the most newsworthy point is Redshift’s claimed $1,000/TB/year cost with no upfront capital expenditure.

So, Amazon’s Redshift is a public-cloud based, SQL-compliant, scalable MPP database that looks cheap-as-chips…what’s not to like?

Well, funny you should ask…after a quick read through the Redshift documentation the ‘gotchas’ we’ve spotted so far are as follows:

  • Data loading  is only supported from Amazon sources such as S3. External data not already in AWS must be loaded to AWS before it can be loaded to Redshift. It looks like Amazon have decided not to licence to PADB high-speed loader, which might be why existing load strategies must be adopted to load external data to AWS, and from AWS the data can then be loaded locally to Redshift as a two-step process
  • The distribution key is a single column only. This may turn out to be highly restrictive unless it can be mitigated by column storage and/or fine-grained control over sort sequences within tables. Teradata enforces no effective limit on the number of columns used to hash the data, whereas Netezza allows up to 4, which is a sensible upper limit if one is to be imposed.
  • An aggregation server is used for final aggregation processing. For large clusters and/or queries with large answer sets this will almost certainly be a performance bottleneck. MPP and ‘shared nothing’ are not the same thing when a single, shared server is used for final aggregation of intermediate results following local aggreagtion across the nodes.

So, as much as we welcome affordable, cloud-based, MPP databases based on PostgreSQL, we can’t help but feel that the above ‘gotchas’ will bite some folks where it hurts. The aggregation server especially is something we’ve seen as a real show-stopper out in the field over several years.

Shared-nothing MPP databases that claim linear scalability simply can’t be architected around an aggregation server…as Datallegro and Exadata have discovered.

Teradata Top 10 Tips #1

A few of us Teradata types at VLDB were recently bemoaning the fact that we see the same poor behaviours year after year, decade after decade, site after site. This is partly an education issue, and partly a policing issue – we can’t be expected to interject every time somebody does something we don’t approve of can we?

Why does this stuff matter? Well, poor Teradata behaviour is a performance killer, plain and simple.

So, in true ‘Idiots Guides’ style, we decided to try and come up with a ‘Teradata Top 10 Tips’ for folks to follow.

So, without further ado, here’s #1 in the series:

#1 – Take *Lots* of Care with the Teradata Primary Index (PI)

Hopefully most Teradata users are aware that the primary index (PI) is used to distribute the data across the processors (AMPs) in the system. It has always been thus. Sensible data distribution in any massively parallel processing (MPP) system like Teradata is critical…or even more important!

Rather than duplicate what has been written elsewhere about the Teradata primary index, we thought it more useful to share our own easy-to-follow thoughts on the Teradata PI.

1 – ALWAYS explicitly code a PI when a table is created, even though you don’t have to – this way you will get what you intended.

2 – DO NOT let the system choose a PI – it may not be what you intended (see above).

3 – ALWAYS code the PI column(s) first in the column list when a table is created – this means the PI might get used as a PI when the table is used as input into a ‘create table as’ (CTAS) operation which might save the day. Also, if your table is used as a template by somebody that violates #1 above, this might save their day, and then they’ll owe you a favour.

4 – ALWAYS code NOT NULL for all PI columns – the optimiser has less work to do when there is no need to allow for nulls in a PI column, plus you’d never intentionally choose a nullable column for a PI, right? Of course not!

5 – DO NOT change the data type of PI columns when copying table definitions (DDL) from the production schema. You may decide you know better then the DBA or data modeller, it has been known. However, if you change that pesky INTEGER column to a CHAR and it’s part of a PI, without knowing it you’ve probably created a table with an entirely different distribution. That’s right, data distrbution is affected by data type – be warned.

6 – DO make the PI unique (UPI) rather than non-unique (NUPI) if possible – this acts as a safety check on the PI values being loaded and can catch unintentional duplicate PI values…with no coding to do, which is nice.

7 – as a guide, DO NOT use more than 4 columns in a PI unless there is a very good reason to do so…which is rarely the case. We’ve seen as many as 15 columns used in a PI, but let’s save that story for another day.

8 – DO NOT create tables without a PI (NOPI tables) to avoid thinking about a sensible PI – NOPI tables are probably not a good thing for end users in most cases.

So there you go – 8 Teradata primary index (PI) ‘tipettes’ for the price of 1. This should send a clear signal that getting the Teradata PI right is *very* important.

Other MPP systems like Netezza or Greenplum, or even Hadoop, may call the PI something like a distribution key or hash key, but the same principle remains: data distribution is *very* important for parallel processing systems.

For those readers that are interested, there’s more on the Teradata PI here with an article on Teradata indexes: http://blog.vldbsolutions.com/teradata-indexes/

VLDB Digital

So, we’ve survived a seemingly endless round of office parties, Xmas and New Year. Just.

Festive celebrations then gave way to birthdays for our very own Dave Agnew and Paul Johnson.

Last week’s combined business/ski trip to Geneva and Chamonix must surely mark the end of the fun and a return to serious matter of 2013…boo!!!

We finished 2012 on a major high over in the digital marketing division of VLDB Group. Some of our customers ‘complained’ that we had driven too much web traffic to them, and that the spike in orders was impacting on their office party schedule. Ho-hum, there’s no pleasing some folks.

Anyway, we decided we’d better publicise the VLDB Digital team’s fine work with a new web site, so without further ado, ladies and gents may we introduce you to the new VLDB Digital web site:

http://www.vldbdigital.com

Hope you all have a great 2013.

We certainly plan to do so :-)

Wednesday’s Teradata User Group (TUG) in London felt a bit like a reunion for anyone who’s ever worked at Lloyds in Manchester. They were out in droves – you know who you are – and it was great to catch up with them all. No prizes for guessing who turned up in jeans, t-shirt and trainers…you can take the boy out of Brum etc…

As expected, the team at Teradata UK put on a great selection of speakers, at a great venue just off Trafalgar Square. So, on to the event…

After milling around with the Lloyds crowd over coffee, Trevor Jukes (WH Smith), the current Teradata User Group (TUG) chairman, opened proceedings. After a brief update from Teradata UK head-honcho Chris Armstrong, it was on with the sessions.

First off, Tom Fastner (ebay) walked us through the 3 system setup at ebay that seems to be the basis for Teradata’s vision of a ‘Unified Data Architecture’. This consists of a traditional Teradata enterprise-class EDW, a cheaper Teradata or Aster appliance for ‘discovery’, and a very cheap Hadoop stack for sorting the signal from the noise amongst less well structured data, such as web logs.

What looks like a significant deviation from the ‘grand central EDW’ theme that Teradata has been promoting for many years is entirely sensible, and justifiable…even if ‘unified’ is perhaps not the best way to describe lots of boxes, data flows, and inevitable data duplication. The main driver for the move away from the single EDW platform is the famous “Big Data three Vs”  – the volume, variety and velocity of the data being produce by the ever-increasing digitisation of our lives in this interwebs-connected world. I wonder if Tim Berners–Lee was an early stage Teradata investor?

The user playpens/sandpits at ebay are still known as ‘virtual data marts’, which I would describe as ‘physical virtual data marts’…but that’s another story ;-)

Following on from Tom, Jeff Peckham (Wells Fargo) talked about the Aster POC that Wells have been running. Jeff has been in data warehousing a long time…in fact he was the DBA at Bank Of America when I was on my very first freelance engagement way back in ’92 – anyone else remember accessing Teradata via ITEQ on an IBM machine running green screens on VM/CMS? Yuk!

It sounded like the biggest challenge Jeff faced was to gain access to the data centre to land the Aster box. Banks eh! The theme that emerged, and continued later in the day, was that the 50 or so SQL-MapReduce functions that ship with Aster enabled time series analysis/graph analysis etc. that simply wouldn’t be feasible using SQL, mainly due to the iterations and brute force processing that would be required. Anyway, it sounds like the Aster POC at Wells Fargo was a great success.

The Teradata CTO, Stephen Brobst, followed on from Jeff. I’ve sat and listened to Stephen more times than I can remember. He’s always got something interesting to say and can be relied on to keep the audience engaged, and not just due to his choice of shirts! The one he wore in London was disappointingly subdued, but never mind.

The main theme of Stephen’s talk was how the ‘old’ (SQL/database) and ‘new’ (noSQL/Hadoop) worlds can and should co-exist, building on the earlier message on the same lines from ebay and Wells Fargo. The way Stephen describes the Silicon Valley stand-off between the ‘database old fogies’ and ‘Hadoop dotcommers’ is truly hilarious. No prizes for guessing that Stephen supports both camps, as we would have expected.

We’ve long held the view that the MapReduce paradigm can be summarised as ‘parallel processing for Java programmers’. That’s no bad thing given that the software is free and can be run on ‘cheap tin’. Of all the enterprise features MapReduce lacks, the key one for us is the SQL query optimiser we take for granted in systems like Teradata.

Stephen nailed it for us when he told the audience: “With MapReduce, you are the optimiser”.

That folks, for us, is why MapReduce/Hadoop will never go truly mainstream in its current form. If analytic applications have to be hand-coded and optimised to run on a cluster of servers using non-declarative languages like Java, it will remain the preserve of the dotcommers who created it in the first place. They’re not wrong in any way. It’s just *way* too hard for mere mortals.

The simple fact is, SQL is here to stay in the analytic mainstream. SQL developers on the whole simply don’t care what goes on under the covers. How the system runs the query is the optimiser’s job, and so it will remain. The power of a good SQL optimiser should become increasingly obvious to all over the next few years.

Having recently followed Professor Marcus Du Sautoy onto the stage at a speaking engagement, I didn’t envy Martin Willcox following on from Stephen. Poor chap. Martin is Teradata’s EMEA head of platform and solution marketing. His unofficial role appears to be Stephen’s wingman when he’s over this side of the pond. Nice work if you can get it etc…

Martin has recently been discussing the ins and outs of ‘in memory databases’ with the folks over at SAP, on which we’ve commented.

Martin gave a pretty detailed insight into Teradata’s development roadmap, with no commitment to any timescales – more a tease than a schedule, if you will. Very interesting all the same. Until we know which bits are NDA-free we’ll keep it zipped for now.

The final session we attended was Mike Whelan’s live Aster demo. Everyone knows live demos are risky, don’t they? Well, sadly, due to technology ‘challenges’, all Mike managed to demonstrate was what all sales folks know already – it always works in Powerpoint!!! Better luck next time Mike.

In a similar vein to the Wells Fargo POC, the main focus of Mike’s talk/demo was the Hadoop ‘npath’ capability that is so hard to achieve in SQL. When asked which other functions are also useful, Mike quipped ‘all of them’. Priceless.

As this was a techy session, a show of hands revealed most of the audience to be SQL developers. Only a couple admitted to also knowing Java. No surprises there at all. What Mike hit upon is the other side of the Hadoop adoption issue – Java programmers are rarely developing analytic applications, and few have experience of developing on large-scale clusters. Java + analytics + clusters is a very small set of folks indeed. Analytic folks and tools know SQL, and mainly only SQL, plain and simple.

Anyway, it’s beer o’clock and that’s more than enough for a blog article…I’ll finish off by thanking everyone at Teradata and all of the speakers for a great event. Well done to all.

, , ,

So, there we were minding our own business the other day…when an update on LinkedIn caught our attention (no, really!).

Said update referred to a blog post by our good friend Martin Willcox at Teradata:

http://blogs.teradata.com/emea/lock-em-all-up-and-throw-away-the-key/

It seems Martin was responding to a couple of blog posts from SAP’s Rob Klopp, who we know from his EMC Greenplum days:

http://www.saphana.com/community/blogs/blog/2012/09/24/hana-vs-teradata-part-1

http://www.saphana.com/community/blogs/blog/2012/09/26/hana-vs-teradata-part-2

Now we know Martin and Rob are both ‘just doing their jobs’ and we have no issue with that (why would we), but, needless to say, we felt compelled to comment:

http://blogs.teradata.com/emea/lock-em-all-up-and-throw-away-the-key/#comments

It seems this is ‘in memory’ debate is quite topical, as Curt Monash recently posted the following on his DBMS2 blog:

http://www.dbms2.com/2012/11/05/do-you-need-an-analytic-rdbms/

“OK, but can we overprovision the RAM by so much that suboptimal performance doesn’t matter? My guess is “Not any time soon” — because efficiency is always a good thing, databases will always grow, and RAM will never be free.”

Wise words indeed from Curt :-)

 

 

Big Data Letter in Wired Magazine

Never one to shy away from providing a comment or two, or three, yours truly managed to get a letter printed in the September edition of Wired UK magazine. The subject, predictably, was ‘big data’. This was a follow-up to an article contained in the August edition covering the ‘Exabyte Revolution’. Shame on all those that missed it!

Imagine how Twitter’s very own Jack Dorsey must have felt to have his photo on the same page as a letter from a ‘big data’ geek…

That’s right…not very excited at all ;-)

Anyway, here’s the letter in all its ‘glory’:

The editor messed up the bit about ebay, but never mind.

Let’s rawk…

 

Statistics collection on databases is a topic that many will feel has been done to death. Although this is true, there still seems to be a lot of misunderstanding as to the benefit, or otherwise, that statistics deliver, especially when put in context of the cost of collection/maintenance.

First of all, let’s consider why statistics are important in a world in which data is physically distributed in some way, as is the case with Massively Parallel Processing (MPP) systems.

In a classic general purpose Symmetric Multi-Processor (SMP) based DBMS scenario (think Oracle or SQL Server) statistics may help the query optimiser to understand that table A is a large fact table, and that table B is a small dimension table. This kind of information is obviously useful when it comes to generating a sensible query plan.

In the MPP world things are a bit more complicated. The optimiser benefits from not just understanding the relative table sizes, but also from understanding the distribution of each across the units of parallelism. Knowing that one/more of the tables is not evenly distributed i.e. ‘skewed’ can have a significant impact on the query plan. Choices such as table re-distribution or duplication that are available to MPP optimisers simply don’t exist in the classic SMP world.

Big MPP systems such as Teradata, Netezza and Greenplum are deployed for various reasons such as processing horsepower, scalability, capacity etc. These capabilities are not in themselves a means to an end. The end is the ability to run big gnarly queries against big gnarly datasets.

Given that this requirement is delivered via set-based SQL, query optimisation is of prime importance. Through the use of SQL, we tell the system what we want (the question) but not how to do it (the execution plan). Turning the question into a plan is the query optimiser’s job. And it’s not always easy. A good plan may run in seconds/minutes, whereas a bad plan that aims to answer the same question may run for hours, or even days in extreme cases.

So, as a general guide, MPP databases tend to benefit from statistics more than might be apparent. Similarly, a lack of statistics can lead to very bad plans in some cases.

OK, so we should just collect statistics on all the things we’re told to collect on then? You would think so, but I’d like to counter this with two thoughts.

Firstly, collecting statistics is a CPU-intensive operation, and CPU is an expensive and finite resource. On Teradata systems specifically, collecting statistics can be a very large consumer of CPU.

Secondly, no amount of statistics can improve on a good plan. That’s the key point that’s lost on a lot of folks. A good plan is generated in a high proportion of cases in which no statistics are present, at least it is by the Teradata optimiser. This is something we’ve taken for granted for decades, often without realising.

Random sampling and evenly distributed data are often all that’s needed for the optimiser to do its job and generate an efficient plan. It matters not a jot that the timings and/or row counts are less accurate, or that there is ‘low confidence’ relating to some of the steps – a good plan is still a good plan.

Statistics should be seen as ‘expensive medicine’. The expense is the CPU resource required to collect/maintain the statistics. The analogy with medicine is that there needs to be an illness (problem) to fix. Only a bad plan should be considered an illness worth incurring the cost of the expensive medicine (CPU cycles).

And therein lies part of the problem. The explain plan needs to be analysed, understood and categorised as ‘good’ or ‘bad’ before a sensible decision can be made as to whether statistics would help.

It’s all too easy to look at the timings and/or row counts and decide that a plan is ‘bad’. It’s often the wrong conclusion. The numbers in explain plans should be taken with a pinch of salt, to say the least, no matter what the technology vendors would have us believe. The only real truth in the plan is the sequence of events and the join/aggregate/sort/scan techniques used to implement each operation. The numbers are an informed guess. I’ve been largely ignoring them for 20 years.

As an aside, predicted elapse times expressed in terms of wall clock time are always subject to the same issue – how can we know the future system state when we run the query in anger? We can’t. Simple really.

So, how do we decide if a plan is good or bad? It boils down to two simple choices – learn how to manually analyse explain plan text, or buy a tool to do the job.

For my sins, I’ve been analysing Teradata explain plans with a pen and paper since the late 1980’s and they haven’t changed much. They’re pretty easy to follow and written in plain English: “1) First, we…2) Next, we…”. The key is knowing what to look for, which is outside the scope of this article.

For those that would rather use a tool to do the job, there are several choices. First, of all there is Teradata’s own ‘Visual Explain’ (VE), which does what it says on the tin – it presents a graphical representation of the explain plan text.

A far more useful tool is ‘Prism’ from Ward Analytics. Prism will identify issues with explain plans and offer suggested fixes. Back of the net!

In summary:

•    statistics are ‘expensive medicine’ and should be viewed and used accordingly
•    a good plan is a good plan with or without statistics
•    the sequence of events, parallelisation of tasks and physical operations used make a plan good or bad, not the predicted numbers
•    the numbers in explain plans – predicted row counts and elapse/CPU times – should not be regarded as accurate
•    learn how to interpret explain plans (or buy a tool to make life easier) to decide if they’re OK

Despite all of this rambling, always remember…on MPP systems it is possible to get a bad plan, with awful performance, for which the only remedy is often, you guessed it – appropriate and up-to-date statistics.

In such cases the ‘expensive medicine’ is the correct cure.

, , , ,