On 2017-09-26 11:35, Joris Meys wrote:
> I don't like the dropping of dimensions either. That doesn't change the
> fact that a tibble reacts differently from a data.frame. So tibbles do not
> inherit correctly from the class data.frame, and it can thus be argued
> that it's against OOP paradigms to pretend tibbles inherit from the
> class data.frame. Defensive coding techniques would check if it's a
> tibble and return an error saying a data.frame is expected. Unless
> tibbles inherit correctly from data.frame.
The correct and logical way (which I use in 'eha') is to check if input
is a data frame, and if not, throw an error. Checking for other things
would soon be too overwhelming.
> I have nothing against tibbles. But calling them "data.frame" raises
> expectations that can't be fulfilled.
Exactly what I think. I wouldn't object to changing base data frames to
behave like tibbles (with a few exceptions).
Göran
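The two input checks discussed above could look roughly like the sketch below. The helper name check_data_frame and its strict argument are made up for illustration and are not taken from 'eha'. With strict = FALSE it only tests is.data.frame(), as Göran describes; with strict = TRUE it also rejects tibbles, as Joris suggests, assuming tibbles carry the class "tbl_df".

```r
# Hypothetical helper (not from 'eha'): validate a 'data' argument up front.
# strict = TRUE additionally rejects tibbles, assumed to carry class "tbl_df".
check_data_frame <- function(x, strict = FALSE) {
  if (!is.data.frame(x)) {
    stop("'data' must be a data frame", call. = FALSE)
  }
  if (strict && inherits(x, "tbl_df")) {
    stop("'data' is a tibble; a plain data.frame is expected", call. = FALSE)
  }
  invisible(x)
}

check_data_frame(data.frame(a = 1:3))  # passes silently
```

Because inherits() only looks at the class attribute, the check does not require the tibble package to be installed.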
On Tue, Sep 26, 2017 at 11:23 AM, Stefan McKinnon Høj-Edwards
<s...@iysik.com> wrote:
Thanks for the examples. Personally, I have been caught out multiple
times by data frames dropping dimensions, so I have a distaste for
this dropping behaviour.
I prefer data frames *not* to drop dimensions. They are
not arrays, where dropping a dimension when slicing makes sense because
all entries have the same data type.
You can pull out a column in vector form from both tibbles and data
frames with the $ index; subsetting a row from a data frame and
forcing it into an atomic vector requires casting all columns to the
lowest common denominator, often character.
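A small base R illustration of that coercion, using the same columns as the example quoted further down in the thread (the object names df and row3 are only for illustration):

```r
df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)

row3 <- unlist(df[3, ])  # flatten one row into a single atomic vector
row3                     # the integer 3 has become the string "3"
typeof(row3)             # "character"
```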
So I would argue that yes, tibbles are data.frames with extra bells
and whistles, even if I do not understand the use of list columns.
I suggest a defensive coding technique: if you need a data frame
subset to really be a vector, cast it as a vector. Users *will*
attempt to throw unexpected structures at your methods. When your
methods fail in mysterious ways because they didn't extract a vector,
users will be stupefied. A failure at `as.vector` will indicate why.
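One possible shape for such a defensive extraction is sketched below. It swaps in `[[` plus explicit checks instead of as.vector(), since `[[` returns the column itself for any object that behaves like a data frame. The helper name pull_column is hypothetical.

```r
# Hypothetical helper: return one named column as an atomic vector,
# or stop with a message that says what went wrong.
pull_column <- function(data, column) {
  stopifnot(is.data.frame(data), is.character(column), length(column) == 1L)
  if (!column %in% names(data)) {
    stop(sprintf("no column named '%s'", column), call. = FALSE)
  }
  out <- data[[column]]  # `[[` never returns a one-column data frame
  if (!is.atomic(out)) {
    stop(sprintf("column '%s' is not an atomic vector", column), call. = FALSE)
  }
  out
}

df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)
pull_column(df, "a")[3]
```

The explicit is.atomic() test is what turns a list column into an immediate, readable error rather than a failure further downstream.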
Kindly,
Stefan
Stefan McKinnon Høj-Edwards
ph.d. Genetics
+44 (0)776 231 2464
+45 2888 6598
Skype: stefan_edwards
2017-09-26 10:05 GMT+01:00 Joris Meys <joris.m...@ugent.be>:
Here's one difference:
library(tibble)  # needed for tibble()
atib <- tibble(a = 1:5, b = letters[5:1])
atib[3, "a"]
as.data.frame(atib)[3, "a"]
The call on the tibble, atib[3, "a"], returns a 1 x 1 tibble (dimensions
are not dropped); the call on the data frame returns a plain vector
(dimensions are dropped). That is a huge difference if you
use [ , aColumn] to select a vector from a data frame.
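Code that states `drop` explicitly does not rely on the class default. The sketch below uses base R only. tibble's `[` method also documents a `drop` argument, so the same calls are assumed to carry over to tibbles, but that is not verified here.

```r
df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)

implicit <- df[3, "a"]                # data.frame default: dimensions dropped
kept     <- df[3, "a", drop = FALSE]  # explicit: stays a 1 x 1 data frame
dropped  <- df[3, "a", drop = TRUE]   # explicit: plain vector

is.data.frame(kept)           # TRUE
identical(implicit, dropped)  # TRUE
```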
Cheers
Joris
On Tue, Sep 26, 2017 at 10:57 AM, Stefan McKinnon Høj-Edwards
<s...@iysik.com> wrote:
Hi Göran,
Could you please elaborate on which kind of subsetting that
Hadley dislikes?
I have yet to encounter operations on data frames that are not
possible on tibbles.
Kindly,
Stefan McKinnon Hoj-Edwards
2017-09-26 8:30 GMT+01:00 Göran Broström <goran.brost...@umu.se>:
> I am beginning to get complaints from users of my CRAN packages
> (especially 'eha') to the effect that they get error messages like
> "Error: Unsupported use of matrix or array for column indexing".
>
> It turns out that they are passing tibbles into functions that expect
> data frames as input. And I am using the kind of subsetting that Hadley
> dislikes (eha is an old package, much older than tibbles). It is of
> course a simple matter to change the code so it handles both data frames
> and tibbles correctly, but this affects many functions, and it will take
> some time. And when the next guy introduces 'troubles' as an improvement
> of 'tibbles', I will have to rewrite the code again.
>
> While I like Hadley's way of doing it, I think it is a mistake to let a
> tibble also be of class data frame. To me it is a matter of inheritance
> and backwards compatibility: A tibble should add nice things to a data
> frame, not change basic behaviour, in order to call itself a data frame.
>
> Is it correct to let a tibble be of class "data.frame"?
>
> Göran Broström
>
> ______________________________________________
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics
tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php