lisoda, I don't think there is a good way to fix the HadoopCatalog
implementation. That's why we recommend not using it.
In the quickstart, the assumption is that you're using a Hive catalog. The
HadoopCatalog example shows how to add additional catalogs (in this case, a
local one for testing). I
Hello Steven.
HadoopCatalog does have many problems, but because the community added it to
the QuickStart chapter in the first place, many users have actually stayed with
HadoopCatalog. There is a huge cost to switching catalogs. In addition, Hive
even uses HadoopCatalog as an implementation o
Lisoda, HadoopCatalog has many issues for production usage like Dan said.
It has never been recommended in production. It was widely used in unit
test code, which is also slowly moving toward InMemoryCatalog. As the
community is aligned behind the REST catalog, it is preferable to limit the
work re
Again, it's my "vision": if the community wants to maintain and move
forward on HadoopCatalog, that's fine (though I'm not sure it would be a good
idea given the "limitations" of filesystem-based catalogs :)).
Let's see what the others are thinking.
Regards
JB
On Mon, Jul 15, 2024 at 8:29 AM lisoda wro
Hi
HadoopCatalog is not a "recommended" catalog for production (at least
up to now). So we should consider either moving it to a separate
repo (if we have a guarantee that it's going to be maintained, otherwise
it doesn't make sense) or removing it to avoid confusion. My take here is
the same (for sever
lisoda,
Unfortunately, I don't agree with your assessment. The problems with
filesystem-based catalog implementations are inherent, and the steps taken to
address them are not adequate to give confidence in the implementation.
Commit atomicity is not solved as it relies on locking, which has a numbe