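I'm looking for a way to find the row count for all my tables in Postgres. I know I can do this one table at a time with:
SELECT count(*) FROM table_name;
but I'd like to see the row count for all the tables and then order by it, to get an idea of how big all my tables are.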
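Current answer
For those trying to evaluate which Heroku plan they need, and who can't wait for Heroku's slow row counter to refresh, here is a simple, practical answer:
Basically, run \dt in psql and copy the results into your favorite text editor (they will look like this: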
public | auth_group | table | axrsosvelhutvw
public | auth_group_permissions | table | axrsosvelhutvw
public | auth_permission | table | axrsosvelhutvw
public | auth_user | table | axrsosvelhutvw
public | auth_user_groups | table | axrsosvelhutvw
public | auth_user_user_permissions | table | axrsosvelhutvw
public | background_task | table | axrsosvelhutvw
public | django_admin_log | table | axrsosvelhutvw
public | django_content_type | table | axrsosvelhutvw
public | django_migrations | table | axrsosvelhutvw
public | django_session | table | axrsosvelhutvw
public | exercises_assignment | table | axrsosvelhutvw
), then run a regex search and replace like this:
^[^|]*\|\s+([^|]*?)\s+\| table \|.*$
to:
select '\1', count(*) from \1 union/g
This will give you something very similar to this:
select 'auth_group', count(*) from auth_group union
select 'auth_group_permissions', count(*) from auth_group_permissions union
select 'auth_permission', count(*) from auth_permission union
select 'auth_user', count(*) from auth_user union
select 'auth_user_groups', count(*) from auth_user_groups union
select 'auth_user_user_permissions', count(*) from auth_user_user_permissions union
select 'background_task', count(*) from background_task union
select 'django_admin_log', count(*) from django_admin_log union
select 'django_content_type', count(*) from django_content_type union
select 'django_migrations', count(*) from django_migrations union
select 'django_session', count(*) from django_session
;
(You will need to remove the last union and add the semicolon at the end manually.)
Run it in psql and you're done.
?column? | count
--------------------------------+-------
auth_group_permissions | 0
auth_user_user_permissions | 0
django_session | 1306
django_content_type | 17
auth_user_groups | 162
django_admin_log | 9106
django_migrations | 19
[..]
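If you would rather script the search-and-replace step than do it in an editor, something like the following might work. It is only a sketch: it assumes the \dt output has been pasted into a string, it reuses the same regex as above, and build_count_query is a hypothetical helper name, not part of psql or Heroku. Like the manual version, it does not quote table names, so it is only suitable for plain lowercase names such as the ones shown.

```python
import re

# Same pattern as in the answer: capture the table name (second column)
# from every \dt row whose type column is "table".
ROW = re.compile(r"^[^|]*\|\s+([^|]*?)\s+\| table \|.*$", re.MULTILINE)


def build_count_query(dt_output: str) -> str:
    """Turn pasted \\dt output into one UNION query (hypothetical helper)."""
    names = ROW.findall(dt_output)
    selects = [f"select '{name}', count(*) from {name}" for name in names]
    # Joining with " union" avoids the dangling union on the last line.
    return " union\n".join(selects) + ";"


# Two rows in the same shape as the listing above.
sample = """\
 public | auth_group | table | axrsosvelhutvw
 public | auth_user | table | axrsosvelhutvw
"""

print(build_count_query(sample))
```

The printed statement can then be pasted into psql as before.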
Other answers
If you are in the psql shell, \gexec lets you execute the syntax described in syed's answer and Aur's answer without manual edits in an external text editor.
with x (y) as (
select
'select count(*), '''||
tablename||
''' as "tablename" from '||
tablename||' '
from pg_tables
where schemaname='public'
)
select
string_agg(y,' union all '||chr(10)) || ' order by tablename'
from x \gexec
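Note that string_agg() is used both to place union all between the statements and to combine the separate data rows into a single unit to be passed into the buffer.
\gexec sends the current query buffer to the server, then treats each column of each row of the query's output (if any) as an SQL statement to be executed.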
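To see what \gexec ends up executing, it can help to rebuild the generated text outside the database. The sketch below mirrors the string concatenation and the string_agg() call from the query above in Python; gexec_statement is a hypothetical name and the table list is made up for illustration. As in the SQL version, table names are inserted unquoted.

```python
def gexec_statement(tablenames: list[str]) -> str:
    """Rebuild the single statement that the CTE query hands to \\gexec."""
    # One "select count(*), '<name>' as "tablename" from <name> " per table,
    # including the trailing space that the SQL version appends.
    parts = [
        f"select count(*), '{t}' as \"tablename\" from {t} "
        for t in tablenames
    ]
    # string_agg(y, ' union all ' || chr(10)) followed by the final ORDER BY.
    return (" union all " + "\n").join(parts) + " order by tablename"


print(gexec_statement(["auth_group", "auth_user"]))
```

Because everything is aggregated into one row with one column, \gexec runs exactly one statement.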
Two simple steps (note: nothing needs to be changed, just copy and paste):
1. Create the function
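create function
cnt_rows(schema text, tablename text) returns bigint
as
$body$
declare
  result bigint;  -- count() returns bigint, so integer could overflow
  query text;
begin
  -- %I quotes each identifier, so mixed-case or unusual names are handled safely
  query := format('SELECT count(1) FROM %I.%I', schema, tablename);
  execute query into result;
  return result;
end;
$body$
language plpgsql;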
2. Run this query to get the total number of rows across all tables
select sum(cnt_rows) as total_no_of_rows from (select
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE') as subq;
Or, to get the row count per table:
select
table_schema,
table_name,
cnt_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
order by 3 desc;
Here is a solution that does not require a function to get an exact count for each table:
select table_schema,
table_name,
(xpath('/row/cnt/text()', xml_count))[1]::text::int as row_count
from (
select table_name, table_schema,
query_to_xml(format('select count(*) as cnt from %I.%I', table_schema, table_name), false, true, '') as xml_count
from information_schema.tables
where table_schema = 'public' --<< change here for the schema you want
) t
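query_to_xml runs the SQL query passed to it and returns XML with the result (the row count for that table). The outer xpath() then extracts the count from that XML and converts it to a number.
The derived table is not strictly necessary, but it makes the xpath() call easier to read; otherwise the whole query_to_xml() call would have to be passed to the xpath() function.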
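To make the extraction step more concrete, here is a rough Python equivalent of what xpath('/row/cnt/text()', ...) followed by the cast does. The sample XML is an assumption about the shape of the query_to_xml output for a single row (a row element containing a cnt element, as the XPath expression implies), and extract_count is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def extract_count(xml_count: str) -> int:
    """Pull the cnt value out of a <row> fragment and convert it to int."""
    row = ET.fromstring(xml_count)  # the root element is <row>
    return int(row.findtext("cnt"))  # text of the <cnt> child


# Assumed shape of one query_to_xml result; the count value is invented.
sample = (
    '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">\n'
    "  <cnt>1306</cnt>\n"
    "</row>"
)

print(extract_count(sample))
```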
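This worked for me (note that n_live_tup is an estimate kept by the statistics system, not an exact count):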
SELECT schemaname,relname,n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;
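To get estimates, see Greg Smith's answer.
To get exact counts, the other answers so far are plagued with some issues, some of them serious (see below). Here is a version that is hopefully better: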
CREATE FUNCTION rowcount_all(schema_name text default 'public')
RETURNS table(table_name text, cnt bigint) as
$$
declare
table_name text;
begin
for table_name in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
LOOP
RETURN QUERY EXECUTE format('select cast(%L as text),count(*) from %I.%I',
table_name, schema_name, table_name);
END LOOP;
end
$$ language plpgsql;
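It takes a schema name as its argument, or public if no argument is given.
To work with a specific list of schemas, or a list coming from a query, without modifying the function, it can be called from within a query like this: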
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
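This produces a 3-column output with the schema, the table, and the row count.
Here are some issues in the other answers that this function avoids: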
Table and schema names shouldn't be injected into executable SQL without being quoted, either with quote_ident or with the more modern format() function with its %I format string. Otherwise some malicious person may name their table tablename;DROP TABLE other_table which is perfectly valid as a table name.
Even without the SQL injection and funny characters problems, a table name may exist in variants differing by case. If a table is named ABCD and another one abcd, the SELECT count(*) FROM... must use a quoted name, otherwise it will skip ABCD and count abcd twice. The %I of format does this automatically.
information_schema.tables lists custom composite types in addition to tables, even when table_type is 'BASE TABLE' (!). As a consequence, we can't iterate on information_schema.tables, otherwise we risk having select count(*) from name_of_composite_type and that would fail. OTOH pg_class where relkind='r' should always work fine.
The type of COUNT() is bigint, not int. Tables with more than 2.15 billion rows may exist (running a count(*) on them is a bad idea, though).
A permanent type need not be created for a function to return a resultset with several columns. RETURNS TABLE(definition...) is a better alternative.
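The quoting point is easy to see when the statement text is built by hand. The sketch below is not PostgreSQL's implementation: quote_ident_sketch is a hypothetical, simplified stand-in that always wraps the name in double quotes and doubles any embedded double quote, whereas the server's quote_ident only adds quotes when they are needed. It is just meant to show why the quoted form keeps abcd and ABCD apart and turns a hostile name into a harmless identifier.

```python
def quote_ident_sketch(name: str) -> str:
    """Simplified identifier quoting: always quote, double embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def count_query(schema: str, table: str) -> str:
    """Build a count statement with both identifiers quoted."""
    return (
        "select count(*) from "
        f"{quote_ident_sketch(schema)}.{quote_ident_sketch(table)}"
    )


# Without quotes, abcd and ABCD would both fold to abcd, and the third
# name would be read as two separate statements.
for table in ("abcd", "ABCD", "tablename;DROP TABLE other_table"):
    print(count_query("public", table))
```

In real code, letting format() with %I do the quoting inside the database, as the function above does, is the safer choice.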