Tuesday, May 26, 2015

[Python Script] Get all tables for a schema in Redshift

Every now and then I need to write a script to fetch data from Redshift. Below is boilerplate code that takes care of creating a connection and running a query on Redshift. I am using the pg8000 driver, as it is written in pure Python.

The code is given below:
import pg8000 as dbapi
from pprint import pprint


def getconnection(database, host, port, user, password):
    """Open an SSL connection to Redshift; return None if it fails."""
    conn = None
    try:
        conn = dbapi.connect(database=database, host=host, port=port,
                             user=user, password=password, ssl=True)
    except Exception as err:
        print(err)
    return conn


def runquery(conn, query):
    """Run a query on the given connection and pretty-print each row."""
    curr = conn.cursor()
    curr.execute(query)
    for row in curr.fetchall():
        pprint(row)
    return None


if __name__ == '__main__':
    config = {
        "database": "databasename",
        "host": "myhost.redshift.amazonaws.com",
        "port": 8192,
        "user": "username",
        "password": "password"
    }
    conn = getconnection(config['database'], config['host'],
                         config['port'], config['user'], config['password'])
    # List every table (relname) with its schema (nspname) and database
    # (datname), skipping the system schemas. Replace 'dwrsg010' with the
    # name of your own database.
    runquery(conn,
             '''
             select datname, nspname, relname
             from pg_class, pg_namespace, pg_database
             where pg_namespace.oid = relnamespace
             and datname = 'dwrsg010'
             and nspname not in
             ('information_schema', 'pg_catalog', 'pg_toast')
             group by datname, nspname, relname
             order by datname, nspname, relname
             ;
             ''')
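
The query above lists tables across every non-system schema in the database. If you only want the tables for a single schema, as the title suggests, the same catalog query can be parameterized. Here is a minimal sketch; fetch_tables_for_schema is a hypothetical helper that is not part of the script above, and it assumes pg8000's default format-style (%s) placeholders:

def fetch_tables_for_schema(conn, database, schema):
    # Return (database, schema, table) tuples for a single schema,
    # using the same pg_class/pg_namespace/pg_database join as above.
    curr = conn.cursor()
    curr.execute(
        '''
        select datname, nspname, relname
        from pg_class, pg_namespace, pg_database
        where pg_namespace.oid = relnamespace
        and datname = %s
        and nspname = %s
        order by relname;
        ''', (database, schema))
    return curr.fetchall()

For example, pprint(fetch_tables_for_schema(conn, config['database'], 'public')) would print every table in the public schema.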