在R中,mean()和median()是标准函数,它们执行您所期望的功能。Mode()告诉您对象的内部存储模式,而不是参数中出现次数最多的值。但是是否存在一个标准库函数来实现向量(或列表)的统计模式?


当前回答

模式并不是在所有情况下都有用。所以函数应该处理这种情况。试试下面的函数。

Mode <- function(v) {
  # checking unique numbers in the input
  uniqv <- unique(v)
  # frquency of most occured value in the input data
  m1 <- max(tabulate(match(v, uniqv)))
  n <- length(tabulate(match(v, uniqv)))
  # if all elements are same
  same_val_check <- all(diff(v) == 0)
  if(same_val_check == F){
    # frquency of second most occured value in the input data
    m2 <- sort(tabulate(match(v, uniqv)),partial=n-1)[n-1]
    if (m1 != m2) {
      # Returning the most repeated value
      mode <- uniqv[which.max(tabulate(match(v, uniqv)))]
    } else{
      mode <- "Two or more values have same frequency. So mode can't be calculated."
    }
  } else {
    # if all elements are same
    mode <- unique(v)
  }
  return(mode)
}

输出,

x1 <- c(1,2,3,3,3,4,5)
Mode(x1)
# [1] 3

x2 <- c(1,2,3,4,5)
Mode(x2)
# [1] "Two or more varibles have same frequency. So mode can't be calculated."

x3 <- c(1,1,2,3,3,4,5)
Mode(x3)
# [1] "Two or more values have same frequency. So mode can't be calculated."

其他回答

我还不能投票,但Rasmus Bååth的答案是我一直在寻找的。 但是,我将稍微修改一下,允许将分布限制在0到1之间。

estimate_mode <- function(x,from=min(x), to=max(x)) {
  d <- density(x, from=from, to=to)
  d$x[which.max(d$y)]
}

我们知道你可能根本不想约束你的分布,那么设置from=-"BIG NUMBER", to="BIG NUMBER"

下面的函数有三种形式:

method = "mode"[默认值]:计算单模态向量的模式,否则返回NA Method = "nmodes":计算vector中模式的个数 Method = "modes":列出单模态或多模态向量的所有模态

modeav <- function (x, method = "mode", na.rm = FALSE)
{
  x <- unlist(x)
  if (na.rm)
    x <- x[!is.na(x)]
  u <- unique(x)
  n <- length(u)
  #get frequencies of each of the unique values in the vector
  frequencies <- rep(0, n)
  for (i in seq_len(n)) {
    if (is.na(u[i])) {
      frequencies[i] <- sum(is.na(x))
    }
    else {
      frequencies[i] <- sum(x == u[i], na.rm = TRUE)
    }
  }
  #mode if a unimodal vector, else NA
  if (method == "mode" | is.na(method) | method == "")
  {return(ifelse(length(frequencies[frequencies==max(frequencies)])>1,NA,u[which.max(frequencies)]))}
  #number of modes
  if(method == "nmode" | method == "nmodes")
  {return(length(frequencies[frequencies==max(frequencies)]))}
  #list of all modes
  if (method == "modes" | method == "modevalues")
  {return(u[which(frequencies==max(frequencies), arr.ind = FALSE, useNames = FALSE)])}  
  #error trap the method
  warning("Warning: method not recognised.  Valid methods are 'mode' [default], 'nmodes' and 'modes'")
  return()
}

下面是一个查找模式的函数:

mode <- function(x) {
  unique_val <- unique(x)
  counts <- vector()
  for (i in 1:length(unique_val)) {
    counts[i] <- length(which(x==unique_val[i]))
  }
  position <- c(which(counts==max(counts)))
  if (mean(counts)==max(counts)) 
    mode_x <- 'Mode does not exist'
  else 
    mode_x <- unique_val[position]
  return(mode_x)
}

您还可以计算一个实例在您的集合中出现的次数,并找到最大次数。如。

> temp <- table(as.vector(x))
> names (temp)[temp==max(temp)]
[1] "1"
> as.data.frame(table(x))
r5050 Freq
1     0   13
2     1   15
3     2    6
> 

抱歉,我可能把它理解得太简单了,但这不是可以工作的吗?(我的机器上的1E6值在1.3秒内):

t0 <- Sys.time()
summary(as.factor(round(rnorm(1e6), 2)))[1]
Sys.time()-t0

你只需要用你的向量替换“round(rnorm(1e6),2)”。