Hi Ralf again,

On Wednesday 06 of January 2010 20:44:57 Ralf Wildenhues wrote:
> Hello Tomas,
>
> * Tomas Oberhuber wrote on Sat, Jan 02, 2010 at 11:33:46AM CET:
> > Now I try to compile whole project with nvcc. It seems to work but I get
> > this
> >
> > libtool: link:
> > nvcc -shared -nostdlib
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterContainer.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescription.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix.o
> > -L/usr/local/cuda/lib64 -lcppunit -lcudart -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0
> > nvcc fatal : Unknown option 'nostdlib'
> >
> > which means that nvcc is also used as linker. Even if I remove -nostdlib,
> > nvcc complains about other parameters. So I think it would be better to
> > link with g++. Can I change linker somehow? And in that case if I do it
> > by hand (copy the command on the command line and replace nvcc by g++) I
> > get this
> >
> > g++ -shared -nostdlib
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterContainer.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescription.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix.o
> > -L/usr/local/cuda/lib64 -lcppunit -lcudart -Wl,-soname
> > -Wl,libtnl-0.1.so.0 -o .libs/libtnl-0.1.so.0.0.0
> > /usr/bin/ld: .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> >
> > Or maybe we can solve it using -Xcompiler and -Xlinker. May I ask what
> > does libtool do now in case we use nvcc to compile or link?
>
> You're right. Libtool doesn't support CXX=nvcc yet, and we also forgot
> some bits of CC=nvcc support. This still needs to be done in Libtool.
>
> Thanks,
> Ralf

so yesterday I found that it is not as simple as I thought. There is a
problem with dependencies. They cannot be generated by gcc, only directly
by nvcc; otherwise we get something like this:

cudafile.lo cudafile.o: \
 /tmp/tmpxft_0000021f_00000000-10_cudafile.ii

Moreover, all other headers are omitted, because (this is my guess) they
were already processed by the nvcc preprocessor. The result is that we
cannot generate dependencies with gcc, only with nvcc. I have found that
nvcc has a flag -M which generates dependencies. Unfortunately, no fast
dependencies are possible here (as I understand it, fast-dep means that
gcc3 is able to compile and generate dependencies at the same time; nvcc
does not seem to do this). So I erased all the stuff around
am__fastdepnvcc which I introduced yesterday :-(. Another ugly thing is
that nvcc is not able to filter out system headers, and so the dependency
files are pretty large :-(.

Things get even more complicated. nvcc uses the flag -o for the output
file with dependencies, which is confusing for libtool. I did not fully
understand what is going on between libtool and depcomp, but I think that
when we call depcomp we want it to generate dependencies as well as
compile the source file. depcomp filters the arguments and then in fact
calls libtool. I handled it somehow in depcomp, but this is by no means a
nice solution. It seems to work :) and I really hope that I will not run
into other problems.

As I said yesterday, the way it works now is just the simplest solution
to get it working. I would like to solve this properly, and in my opinion
nvcc should be used only for .cu files.

The following is a patch against my version from yesterday. In fact, it
reverts all changes made in the files depend2.am and depend.m4. The main
changes are in depcomp.

diff -r automake-1.11.1/lib/am/depend2.am autotools/automake-1.11.1/lib/am/depend2.am
73a74,84
> if %FASTDEPNVCC%
> ## Fast-dep mode for nvcc is similar to gcc
> ## We just add -Xcompiler flag.
> ?!GENERIC? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$| $(DEPDIR)/&|;s|\.o$$||'`;\
> ?GENERIC??SUBDIROBJ? %COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
86a98
> endif !%FASTDEPNVCC%
102a115,125
> if %FASTDEPNVCC%
> ## In fast-dep mode, we can always use -o.
> ## For non-suffix rules, we must emulate a VPATH search on %SOURCE%.
> ?!GENERIC? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`if test -f '%SOURCE%'; then $(CYGPATH_W) '%SOURCE%'; else $(CYGPATH_W) '$(srcdir)/%SOURCE%'; fi`
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'`
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$| $(DEPDIR)/&|;s|\.obj$$||'`;\
> ?GENERIC??SUBDIROBJ? %COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'` &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
115a139
> endif !%FASTDEPNVCC%
132a157,166
> if %FASTDEPNVCC%
> ## fast-dep mode for nvcc only add -Xcompiler
> ?!GENERIC? %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$| $(DEPDIR)/&|;s|\.lo$$||'`;\
> ?GENERIC??SUBDIROBJ? %LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> else !%FASTDEPNVCC%
141a176
> endif !%FASTDEPNVCC%
diff -r automake-1.11.1/lib/depcomp autotools/automake-1.11.1/lib/depcomp
97,98d96
< echo $@
<
128,146c126,128
< ## nVidia CUDA 2.3 does not suppport fast-dep mode :-(
< ## this part is ugly someone should rewrite it
< ## 1. nvcc flag fro dependencies is -M
< ## however nvcc does not filter system headers :-(
< ## 2. the output file for the dependencies is given by -o
< ## which is confusing for libtool and so we proceed as follows
< ## a. we need to call directly nvcc therefore we filter out args like:
< ## /bin/bash (this is not robust enough it works only with bash)
< ## ../../libtool
< ## --tag=CXX or --tag=CC
< ## --mode=compile
< ## what remains after filtering is
< ## nvcc -M -odir $depfiledir $source
< ## b. we call something like
< ## nvcc -M -odir $depfiledir $source > $tmpdepfile
< ## 3. as I understood libtool assumes that calling depcomp
< ## initiates compilation. Therefore we call again given arguments.
< depfiledir=`dirname $depfile`
< ARG_STORE=$@
---
> ## nVidia CUDA 2.3 compiler combined with gcc3
> ## here we just add -Xcompiler parameter to pass
> ## gcc3 parameters to gcc3
150,156c132
< -c) set fnord "$@" -M ;;
< -o) set fnord "$@" -odir "$depfiledir" ;;
< "$object") set fnord "$@" ;;
< *libtool) set fnord "$@" ;;
< --tag*) set fnord "$@" ;;
< --mode*) set fnord "$@" ;;
< *bash) set fnord "$@" ;;
---
> -c) set fnord "$@" -Xcompiler -MT -Xcompiler "$object" -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
162,163c138,140
< if "$@" > "$tmpdepfile"; then
< mv "$tmpdepfile" "$depfile"
---
> "$@"
> stat=$?
> if test $stat -eq 0; then :
166c143
< exit 255;
---
> exit $stat
168,170c145
<
< $ARG_STORE
< exit $?;
---
> mv "$tmpdepfile" "$depfile"
diff -r automake-1.11.1/m4/depend.m4 autotools/automake-1.11.1/m4/depend.m4
155a156,158
> AM_CONDITIONAL([am__fastdepnvcc$1], [
> test "x$enable_dependency_tracking" != xno \
> && test "$am_cv_$1_dependencies_compiler_type" = nvcc])

Cheers,
Tomas.