Tag Archives: query

02 Mar

No More Spool Space Teradata Query Solution

Have you ever come across a situation called NO MORE SPOOL SPACE?  My friend does and hence I studied a bit about Teradata.

The query:
SELECT DISTINCT fieldname FROM tablename;

The error:

Correct, the problem wasn’t related to MySQL, that was for something called Teradata.

Initially, I could see that query should be using temporary space (from name SPOOL SPACE I guessed) to get distinct data – may be it’s a huge table or something.
So I think one has to look into temporary table space’s size – they call it SPOOL SPACE.

Next is Google’s turn to explain me more, I Googled with appropriate keywords and got following understandings:
Spool is temporary disk space used to hold intermediate rows during query processing, and to hold the rows in the answer set of a transaction.

Spool space is allocated to a user as a total amount available for that user, however this total amount is spread across all AMPS. This means if you’re allocated 200G of spool and you’re on a system with 24 AMPS, you’re allocated ~8.3G of spool on each AMP. If any AMP exceeds its allocated amount of spool during a transaction you submit, your query is cancelled with error: NO MORE SPOOL SPACE.

Common reasons for “No More Spool Space Error”:

1) All the rows in the query are being processed on a few amps creating a “hot amp” situation. In this case, just a few amps are racking up spool while the others sit comparatively idle. This is caused when the tables in the query are missing stats, have been improperly PI’d, or are otherwise “untuned”.

2) The query is excluding critical join criteria resulting in Cartesian products.

3) You just don’t have enough space to perform the query.
From other readings I suggested one query change:

SELECT fieldname FROM tablename GROUP BY fieldname;

The reason behind above query is:
Teradata uses two different methods to produce the result sets of above queries, However the performance varies dramatically depending on the data.

SELECT DISTINCT is designed for data sets that are nearly unique to begin with and works by sorting the entire intermediate spool file and discarding duplicate values.

SELECT/GROUP BY is designed for data sets that have relatively few unique values and works by performing an AMP local grouping operation and then merging the partial result sets for final processing.

Few other steps (even) I can suggest to solve such problem are:

– Explain to see execution plan to have an idea.
– Check table definition for indexes: Show table table-name;
– CREATE INDEX indexName (field-name) on table-name;

Finally, if nothing happens contact your DBA or the “right person” 🙂

I hope this will help someone, somewhere,  someday.

If you know that I’ve misunderstood something please comment and give respective links – I should be correcting it asap.

08 Feb

Oracle query-eliminate duplicate but one using rowid

Yesterday a friend came across an oracle query problem:

Consider below table:

cid cname.... cdata
1 x xxxx
1 x xxxx ..
2 xzzz fjnd
3 a evddd

Now the problem was cid is repeated and cdata column is having some extra characters but data is considered same (eg. xxxx & xxxx..).
so when he does
SELECT cid,cname....cdata FROM TABLENAME ....;

He kept on getting duplicated records which intern he wanted to have only one per duplicated cid.
It’s not by the way of table structure but by the way of query he wanted to have results.

I suggested him:
SELECT cid,cname....cdata FROM TABLENAME GROUP BY cid;

As far as I know this works with MySQL, but it gave error as I came to know that I have to include all SELECTed variables in GROUP BY clause.
This is Oracle 🙂

Well later I came back to my place and recollected something called ROWID of Oracle, may be that can help!!

Then I created similar table in MySQL to “stimulate” Oracle rowID:

rowId cid cname .... cdata
1 1 x xxxx
2 1 x xxxx ..
3 2 xzzz fjnd
4 3 a evddd

And finally I got the solution as:

select cid,cname….cdata from tablename where rowID in (select min(rowID) from tablename group by cid);

So finally I decided to give it a try on Oracle and it worked 🙂

select * from TABLENAME WHERE rowid in (select min(rowid) from TABLENAME group by DUPLICATECOLUMN);

-- Kedar Vaijanapurkar --