On Sun, 24 May 2020 at 21:35, None via users <users@lists.fedoraproject.org>
wrote:

> Dear fellow Fedora users,
>
> If I have a data file called 15.dat with the following content:
>
> $ cat 15.dat
> 1
> 3
> 1
> 0
> 2
>
> And I want to find the minimum, first quartile, median, third quartile and
> maximum (the five-number summary).
> We can use datamash like this:
> $ cat 15.dat | datamash min 1 q1 1 median 1 q3 1 max 1

> 0       1       1.5     2.75    6


> Q3 is reported as 2.75, but if we split the sorted data in half, the median
> of the upper half is 3.
>

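There is more than one convention for computing quartiles, so two tools can
both be right and still disagree. R makes it easy to see what is being done:
typing a function's name at the R prompt prints its source. Here is fivenum,
which computes Tukey's five-number summary: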

> fivenum
function (x, na.rm = TRUE)
{
    xna <- is.na(x)
    if (any(xna)) {
        if (na.rm)
            x <- x[!xna]
        else return(rep.int(NA, 5))
    }
    x <- sort(x)
    n <- length(x)
    if (n == 0)
        rep.int(NA, 5)
    else {
        n4 <- floor((n + 3)/2)/2
        d <- c(1, n4, (n + 1)/2, n + 1 - n4, n)
        0.5 * (x[floor(d)] + x[ceiling(d)])
    }
}
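For anyone without R at hand, here is a rough Python sketch of the same
calculation next to a plain linear-interpolation quantile (the rule R calls
"type 7"). Some assumptions to flag: the function names are mine, the sample
is the five values from the quoted file plus a 6 (a guess, chosen only because
it is consistent with the quoted output), and I have not checked the datamash
source to confirm that its q1/q3 use the interpolation rule.

```python
import math


def fivenum(values):
    """Python port of R's fivenum(): Tukey's five-number summary."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        return [math.nan] * 5
    n4 = math.floor((n + 3) / 2) / 2
    # 1-based positions of min, lower hinge, median, upper hinge, max.
    # A position ending in .5 means "average the two neighbours".
    d = [1, n4, (n + 1) / 2, n + 1 - n4, n]
    return [0.5 * (x[math.floor(p) - 1] + x[math.ceil(p) - 1]) for p in d]


def interpolated_quantile(values, p):
    """Linear interpolation between order statistics (R's quantile type 7)."""
    x = sorted(values)
    if not x:
        return math.nan
    h = (len(x) - 1) * p           # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(x) - 1)   # clamp so p = 1 stays in range
    return x[lo] + (h - lo) * (x[hi] - x[lo])


# ASSUMPTION: hypothetical sample, the quoted file's values plus a 6.
sample = [1, 3, 1, 0, 2, 6]

print("hinges:      ", fivenum(sample))
print("interpolated:",
      [interpolated_quantile(sample, p) for p in (0, 0.25, 0.5, 0.75, 1)])
```

With this sample the hinge method gives an upper quartile of 3 (the median of
the upper half), while interpolation gives 2.75. Both are legitimate answers
under different definitions of "quartile".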



-- 
George N. White III