用MySQL计算中位数最简单(希望不会太慢)的方法是什么?我已经使用AVG(x)来寻找平均值,但我很难找到一个简单的方法来计算中位数。现在,我将所有的行返回到PHP,进行排序,然后选择中间的行,但是肯定有一些简单的方法可以在一个MySQL查询中完成它。

示例数据:

id | val
--------
 1    4
 2    7
 3    2
 4    2
 5    9
 6    8
 7    3

对val排序得到2 2 3 4 7 8 9,因此中位数应该是4,而SELECT AVG(val) == 5。


当前回答

另一个对Velcrow答案的重复,但使用了一个中间表,并利用了用于行编号的变量来获得计数,而不是执行额外的查询来计算它。还开始计数,以便第一行是第0行,以便简单地使用Floor和Ceil选择中位数行。

SELECT Avg(tmp.val) as median_val
    FROM (SELECT inTab.val, @rows := @rows + 1 as rowNum
              FROM data as inTab,  (SELECT @rows := -1) as init
              -- Replace with better where clause or delete
              WHERE 2 > 1
              ORDER BY inTab.val) as tmp
    WHERE tmp.rowNum in (Floor(@rows / 2), Ceil(@rows / 2));

其他回答

MariaDB / MySQL:

SELECT AVG(dd.val) as median_val
FROM (
SELECT d.val, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
  FROM data d, (SELECT @rownum:=0) r
  WHERE d.val is NOT NULL
  -- put some where clause here
  ORDER BY d.val
) as dd
WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) );

Steve Cohen指出,在第一次传递之后,@rownum将包含总行数。这可用于确定中值,因此不需要第二次传递或连接。

此外,AVG(dd.val)和dd.row_number IN(…)用于在有偶数条记录时正确地产生中位数。推理:

SELECT FLOOR((3+1)/2),FLOOR((3+2)/2); -- when total_rows is 3, avg rows 2 and 2
SELECT FLOOR((4+1)/2),FLOOR((4+2)/2); -- when total_rows is 4, avg rows 2 and 3

最后,MariaDB 10.3.3+包含一个MEDIAN函数

最简单和快速的方法来计算中位数在mysql。

select x.col
from   (select lat_n, 
               count(1) over (partition by 'A')        as total_rows, 
               row_number() over (order by col asc) as rank_Order 
        from   station ft) x 
where  x.rank_Order = round(x.total_rows / 2.0, 0) 

让我们创建一个名为numbers的示例表

这个答案是针对mysql数据库的

在postgres Sql中,它简单地使用per_cont函数

创建表数字( num INT, 频率整数 );

在数字表中插入值

插入数字 (7) 0 (1, 1), (2、3), (1) 3 (9,1), (1, 1), (2、3), (1) 3 (9,1);

——select * from numbers

作为递归num_frequency (num,frequency, i) ( 选择num,频率,1 从数字 UNION ALL 选择num,频率,i + 1 从num_frequency num_frequency的地方。I < num_frequency.frequency )

select * (max(当numbers=lower_limit时,则num else null end)/2 +max(当数字=upper_limit时,则num else null end)/2)作为中位数 从( select *, total_number % 2, 情况下 当total_number%2=0时,total_number/2 Else (total_number+1)/2 end as lower_limit, 情况下 当total_number%2=0时,total_number/2+1 其他(total_number + 1) / 2 结束为upper_limit

从( Select *,max(numbers) over() as total_number from ( Select num,row_number() over(按num排序) 作为num_frequency中的数字 b) b) b)

通常,我们不仅需要为整个表计算Median,还需要为与ID相关的聚合计算Median。换句话说,计算表中每个ID的中位数,其中每个ID有许多记录。(良好的性能和工作在许多SQL +修复偶数和赔率的问题,更多关于不同的中值方法的性能https://sqlperformance.com/2012/08/t-sql-queries/median)

SELECT our_id, AVG(1.0 * our_val) as Median
FROM
( SELECT our_id, our_val, 
  COUNT(*) OVER (PARTITION BY our_id) AS cnt,
  ROW_NUMBER() OVER (PARTITION BY our_id ORDER BY our_val) AS rn
  FROM our_table
) AS x
WHERE rn IN ((cnt + 1)/2, (cnt + 2)/2) GROUP BY our_id;

希望能有所帮助

我发现这个答案非常有用——https://www.eversql.com/how-to-calculate-median-value-in-mysql-using-a-simple-sql-query/

SET @rowindex := -1;

SELECT
   AVG(g.grade)
FROM
   (SELECT @rowindex:=@rowindex + 1 AS rowindex,
       grades.grade AS grade
    FROM grades
    ORDER BY grades.grade) AS g
WHERE
g.rowindex IN (FLOOR(@rowindex / 2) , CEIL(@rowindex / 2));