On 12/04/2024 20:02, Ajit Agarwal wrote: > Hello Alex: > > On 11/04/24 7:55 pm, Alex Coplan wrote: > > On 10/04/2024 23:48, Ajit Agarwal wrote: > >> Hello Alex: > >> > >> On 10/04/24 7:52 pm, Alex Coplan wrote: > >>> Hi Ajit, > >>> > >>> On 10/04/2024 15:31, Ajit Agarwal wrote: > >>>> Hello Alex: > >>>> > >>>> On 10/04/24 1:42 pm, Alex Coplan wrote: > >>>>> Hi Ajit, > >>>>> > >>>>> On 09/04/2024 20:59, Ajit Agarwal wrote: > >>>>>> Hello Alex: > >>>>>> > >>>>>> On 09/04/24 8:39 pm, Alex Coplan wrote: > >>>>>>> On 09/04/2024 20:01, Ajit Agarwal wrote: > >>>>>>>> Hello Alex: > >>>>>>>> > >>>>>>>> On 09/04/24 7:29 pm, Alex Coplan wrote: > >>>>>>>>> On 09/04/2024 17:30, Ajit Agarwal wrote: > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 05/04/24 10:03 pm, Alex Coplan wrote: > >>>>>>>>>>> On 05/04/2024 13:53, Ajit Agarwal wrote: > >>>>>>>>>>>> Hello Alex/Richard: > >>>>>>>>>>>> > >>>>>>>>>>>> All review comments are incorporated. > >>> <snip> > >>>>>>>>>>>> @@ -2890,8 +3018,8 @@ ldp_bb_info::merge_pairs (insn_list_t > >>>>>>>>>>>> &left_list, > >>>>>>>>>>>> // of accesses. If we find two sets of adjacent accesses, call > >>>>>>>>>>>> // merge_pairs. > >>>>>>>>>>>> void > >>>>>>>>>>>> -ldp_bb_info::transform_for_base (int encoded_lfs, > >>>>>>>>>>>> - access_group &group) > >>>>>>>>>>>> +pair_fusion_bb_info::transform_for_base (int encoded_lfs, > >>>>>>>>>>>> + access_group &group) > >>>>>>>>>>>> { > >>>>>>>>>>>> const auto lfs = decode_lfs (encoded_lfs); > >>>>>>>>>>>> const unsigned access_size = lfs.size; > >>>>>>>>>>>> @@ -2909,7 +3037,7 @@ ldp_bb_info::transform_for_base (int > >>>>>>>>>>>> encoded_lfs, > >>>>>>>>>>>> access.cand_insns, > >>>>>>>>>>>> lfs.load_p, > >>>>>>>>>>>> access_size); > >>>>>>>>>>>> - skip_next = access.cand_insns.empty (); > >>>>>>>>>>>> + skip_next = bb_state->cand_insns_empty_p > >>>>>>>>>>>> (access.cand_insns); > >>>>>>>>>>> > >>>>>>>>>>> As above, why is this needed? > >>>>>>>>>> > >>>>>>>>>> For rs6000 we want to return always true. as load store pair > >>>>>>>>>> that are to be merged with 8/16 16/32 32/64 is occuring for rs6000. > >>>>>>>>>> And we want load store pair to 8/16 32/64. Thats why we want > >>>>>>>>>> to generate always true for rs6000 to skip pairs as above. > >>>>>>>>> > >>>>>>>>> Hmm, sorry, I'm not sure I follow. Are you saying that for rs6000 > >>>>>>>>> you have > >>>>>>>>> load/store pair instructions where the two arms of the access are > >>>>>>>>> storing > >>>>>>>>> operands of different sizes? Or something else? > >>>>>>>>> > >>>>>>>>> As it stands the logic is to skip the next iteration only if we > >>>>>>>>> exhausted all the candidate insns for the current access. In the > >>>>>>>>> case > >>>>>>>>> that we didn't exhaust all such candidates, then the idea is that > >>>>>>>>> when > >>>>>>>>> access becomes prev_access, we can attempt to use those candidates > >>>>>>>>> as > >>>>>>>>> the "left-hand side" of a pair in the next iteration since we > >>>>>>>>> failed to > >>>>>>>>> use them as the "right-hand side" of a pair in the current > >>>>>>>>> iteration. > >>>>>>>>> I don't see why you wouldn't want that behaviour. Please can you > >>>>>>>>> explain? > >>>>>>>>> > >>>>>>>> > >>>>>>>> In merge_pair we get the 2 load candiates one load from 0 offset and > >>>>>>>> other load is from 16th offset. Then in next iteration we get load > >>>>>>>> from 16th offset and other load from 32 offset. In next iteration > >>>>>>>> we get load from 32 offset and other load from 48 offset. > >>>>>>>> > >>>>>>>> For example: > >>>>>>>> > >>>>>>>> Currently we get the load candiates as follows. > >>>>>>>> > >>>>>>>> pairs: > >>>>>>>> > >>>>>>>> load from 0th offset. > >>>>>>>> load from 16th offset. > >>>>>>>> > >>>>>>>> next pairs: > >>>>>>>> > >>>>>>>> load from 16th offset. > >>>>>>>> load from 32th offset. > >>>>>>>> > >>>>>>>> next pairs: > >>>>>>>> > >>>>>>>> load from 32th offset > >>>>>>>> load from 48th offset. > >>>>>>>> > >>>>>>>> Instead in rs6000 we should get: > >>>>>>>> > >>>>>>>> pairs: > >>>>>>>> > >>>>>>>> load from 0th offset > >>>>>>>> load from 16th offset. > >>>>>>>> > >>>>>>>> next pairs: > >>>>>>>> > >>>>>>>> load from 32th offset > >>>>>>>> load from 48th offset. > >>>>>>> > >>>>>>> Hmm, so then I guess my question is: why wouldn't you consider merging > >>>>>>> the pair with offsets (16,32) for rs6000? Is it because you have a > >>>>>>> stricter alignment requirement on the base pair offsets (i.e. they > >>>>>>> have > >>>>>>> to be a multiple of 32 when the operand size is 16)? So the pair > >>>>>>> offsets have to be a multiple of the entire pair size rather than a > >>>>>>> single operand size> > >>>>>> > >>>>>> We get load pair at a certain point with (0,16) and other program > >>>>>> point we get load pair (32, 48). > >>>>>> > >>>>>> In current implementation it takes offsets loads as (0, 16), > >>>>>> (16, 32), (32, 48). > >>>>>> > >>>>>> But In rs6000 we want the load pair to be merged at different points > >>>>>> as (0,16) and (32, 48). for (0,16) we want to replace load lxvp with > >>>>>> 0 offset and other load (32, 48) with lxvp with 32 offset. > >>>>>> > >>>>>> In current case it will merge with lxvp with 0 offset and lxvp with > >>>>>> 16 offset, then lxvp with 32 offset and lxvp with 48 offset which > >>>>>> is incorrect in our case as the (16-32) case 16 offset will not > >>>>>> load from even register and break for rs6000. > >>>>> > >>>>> Sorry, I think I'm still missing something here. Why does the address > >>>>> offset > >>>>> affect the parity of the tranfser register? ISTM they needn't be > >>>>> related at > >>>>> all (and indeed we can't even know the parity of the transfer register > >>>>> before > >>>>> RA, but perhaps you're only intending on running the pass after RA?) > >>>>> > >>>> > >>>> We have load pair with (0,16) wherein these loads are adjacent and > >>>> replaced with lxvp. > >>>> > >>>> Semantic of lxvp instruction is that it loads adjacent load pair in > >>>> even register and even_register + 1. > >>>> > >>>> We replace the above load pair with lxvp instruction and then we > >>>> dont need to merge (16,32) as (0, 16) is already merged and instead > >>>> we merge (32,48). > >>> > >>> Ok, but the existing logic should already account for this. I.e. if we > >>> successfully merge (0,16), then we don't attempt to merge (16,32). We'd > >>> only > >>> attempt to merge (16,32) if the merge of (0,16) failed (for whatever > >>> reason). > >>> So I don't see that there's anything specific to lxvp that requires this > >>> logic > >>> to change, _unless_ you have a stricter alignment requirement on the > >>> offsets as > >>> I mentioned before. > >>> > >> > >> Thanks for the suggestion. It worked for rs6000 also with current changes. > >> Sorry for the confusion. > > > > Alright, glad we got to the bottom of this! > > Thanks. > > > >> > >>>> > >>>> Yes you are correct, the addresss offset doesn't affect the parity of > >>>> the register transfer as we are doing fusion pass before RA. > >>>> If that is the case then I think it would be better to introduce a > >>>>>>> virtual function (say pair_offset_alignment_ok_p) that vets the base > >>>>>>> offset of the pair (prev_access->offset in transform_for_base). I > >>>>>>> guess > >>>>>>> it would also take access_size as a parameter and for aarch64 it > >>>>>>> should > >>>>>>> check: > >>>>>>> > >>>>>>> multiple_p (offset, access_size) > >>>>>>> > >>>>>>> and for rs6000 it could check: > >>>>>>> > >>>>>>> multiple_p (offset, access_size * 2) > >>>>>>> > >>>>>>> and we would then incorporate a call to that predicate in the else if > >>>>>>> condition of tranform_for_base. > >>>>>>> > >>>>>>> It would have the same limitation whereby we assume that MEM_EXPR > >>>>>>> offset > >>>>>>> alignment is a good proxy for RTL offset alignment, but we already > >>>>>>> depend on that with the multiple_p check in track_via_mem_expr. > >>>>>>> > >>>> I have addressed the above hooks and it worked fine with both rs6000 > >>>> and aarch64. I am sending subsequent patch in some time that address > >>>> above. > >>>> > >>>>> How do you plan on handling this even-odd requirement for rs6000? > >>>>> > >>>> > >>>> We plan to handle with V16QI subreg: 0 and V16QI subreg : 16 to > >>>> generate register pair and thats what we generate and implement > >>>> in rs6000 target > >>>> code. > >>> > >>> Ah, this is coming back to me now. Sorry, I should have remembered this > >>> from > >>> the previous discussion with Richard S. > >>> > >>> Apologies for going on a slight tangent here, but if you're running > >>> before RA are you planning to create a new OImode pseudo register for > >>> the lxvp insn and then somehow update uses of the old transfer registers > >>> to replace them with subregs of that OImode pseudo? > >> > >> Yes I do the same as you have mentioned. We generate register pairs > >> with 256 bit mode with two subregs of 128 bit modes with 0 and > >> 16 offset. > >> > >> Or do you just plan > >>> or replacing the individual loads with moves (instead of deleting them)? > >>> I guess the latter would be simpler and might work better in the > >>> presence of hard regs. > >>> > >> > >> Would you mind explaining how to generate register pairs with lxvp by > >> replacing loads with moves. > > > > Yeah, so suppose you have something like: > > > > (set (reg:V16QI v1) (mem:V16QI addr)) > > (set (reg:V16QI v2) (mem:V16QI addr+16)) > > > > then when you insert the lxvp you can then (logically) replace the > > original load instructions with moves from the appropriate subreg, as > > follows: > > > > (set (reg:OI pair-pseudo) (mem:OI addr)) ; lxvp > > (set (reg:V16QI v1) (subreg:V16QI (reg:OI pair-pseudo) 0)) > > (set (reg:V16QI v2) (subreg:V16QI (reg:OI pair-pseudo) 16)) > > > > Any Pseudo created with gen_rtx_REG like > gen_rtx_REG (OOmode, REGNO (dest_exp) will error'ed out > with unrecognize insn by LRA.
I'm not surprised that goes wrong: you can't just create a new REG rtx in a different mode and reuse the regno of an existing pseudo. > > If I create pseudo with gen_reg_rtx (OOmode) will error'ed > out with new_defs Pseudo register is not found in > change->new_defs. Yeah, I suppose you'd need to add an RTL-SSA def for the new pseudo. > > Also the sequential register pairs are not generated by > Register Allocator. So how do you get the sequential pairs with your current approach? My understanding was that what I suggested above doesn't really change what you're doing w.r.t the lxvp insn itself, but maybe I didn't properly understand the approach taken in your initial patchset. Thanks, Alex > > Thats why I haven't used above method as I also thought > through. > > Please let me know what do you think. > > Thanks & Regards > Ajit > > now I'm not sure off-hand if this exact combination of subregs and mode > > changes is valid, but hopefully you get the idea. The benefit of this > > approach is that it keeps the transformation local and is more > > compile-time efficient (we don't have to look through all mentions of > > v1/v2 and replace them with subregs). I'd hope that the RA can then > > clean up any redundant moves (especially in the case that v1/v2 are > > pseudos). > > > > That would mean that you don't need all the grubbing around with DF > > looking for occurrences of the transfer regs. Instead we'd just need > > some logic to insert the extra moves after the lxvp for rs6000, and I > > think this might fit more cleanly into the current pass. > > > > Does that make sense? In any case, this shouldn't really affect the > > preparatory aarch64 patch because we were planning to defer adding any > > hooks that are only needed for rs6000 from the initial aarch64/generic > > split. > > > > Thanks, > > Alex > > > >> > >> Thanks & Regards > >> Ajit > >> > >>> Thanks, > >>> Alex > >>> > >>>> > >>>> Thanks & Regards > >>>> Ajit > >>>>> Thanks, > >>>>> Alex > >>>>> > >>>>>> > >>>>>> lxvp should load from even registers and then loaded value will > >>>>>> be in even register and even register +1 (which is odd). > >>>>>> > >>>>>> Thanks & Regards > >>>>>> Ajit > >>>>>>> If that is the case then I think it would be better to introduce a > >>>>>>> virtual function (say pair_offset_alignment_ok_p) that vets the base > >>>>>>> offset of the pair (prev_access->offset in transform_for_base). I > >>>>>>> guess > >>>>>>> it would also take access_size as a parameter and for aarch64 it > >>>>>>> should > >>>>>>> check: > >>>>>>> > >>>>>>> multiple_p (offset, access_size) > >>>>>>> > >>>>>>> and for rs6000 it could check: > >>>>>>> > >>>>>>> multiple_p (offset, access_size * 2) > >>>>>>> > >>>>>>> and we would then incorporate a call to that predicate in the else if > >>>>>>> condition of tranform_for_base. > >>>>>>> > >>>>>>> It would have the same limitation whereby we assume that MEM_EXPR > >>>>>>> offset > >>>>>>> alignment is a good proxy for RTL offset alignment, but we already > >>>>>>> depend on that with the multiple_p check in track_via_mem_expr. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Alex > >>>>>>> > >>>>>>>> > >>>>>>>> Thanks & Regards > >>>>>>>> Ajit > >>>>>>>>> Thanks, > >>>>>>>>> Alex > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> } > >>>>>>>>>>>> prev_access = &access; > >>>>>>>>>>>> } > >>>>>>>>>>>> @@ -2919,7 +3047,7 @@ ldp_bb_info::transform_for_base (int > >>>>>>>>>>>> encoded_lfs, > >>>>>>>>>>>> // and remove all the tombstone insns, being sure to reparent > >>>>>>>>>>>> any uses > >>>>>>>>>>>> // of mem to previous defs when we do this. > >>>>>>>>>>>> void > >>>>>>>>>>>> -ldp_bb_info::cleanup_tombstones () > >>>>>>>>>>>> +pair_fusion_bb_info::cleanup_tombstones () > >>>>>>>>>>>> { > >>>>>>>>>>>> // No need to do anything if we didn't emit a tombstone insn > >>>>>>>>>>>> for this BB. > >>>>>>>>>>>> if (!m_emitted_tombstone) > >>>>>>>>>>>> @@ -2947,7 +3075,7 @@ ldp_bb_info::cleanup_tombstones () > >>>>>>>>>>>> > >>>>>>>>>>>> template<typename Map> > >>>>>>>>>>>> void > >>>>>>>>>>>> -ldp_bb_info::traverse_base_map (Map &map) > >>>>>>>>>>>> +pair_fusion_bb_info::traverse_base_map (Map &map) > >>>>>>>>>>>> { > >>>>>>>>>>>> for (auto kv : map) > >>>>>>>>>>>> { > >>>>>>>>>>>> @@ -2958,7 +3086,7 @@ ldp_bb_info::traverse_base_map (Map &map) > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> void > >>>>>>>>>>>> -ldp_bb_info::transform () > >>>>>>>>>>>> +pair_fusion_bb_info::transform () > >>>>>>>>>>>> { > >>>>>>>>>>>> traverse_base_map (expr_map); > >>>>>>>>>>>> traverse_base_map (def_map); > >>>>>>>>>>>> @@ -3167,14 +3295,13 @@ try_promote_writeback (insn_info *insn) > >>>>>>>>>>>> // for load/store candidates. If running after RA, also try > >>>>>>>>>>>> and promote > >>>>>>>>>>>> // non-writeback pairs to use writeback addressing. Then try > >>>>>>>>>>>> to fuse > >>>>>>>>>>>> // candidates into pairs. > >>>>>>>>>>>> -void ldp_fusion_bb (bb_info *bb) > >>>>>>>>>>>> +void pair_fusion::ldp_fusion_bb (bb_info *bb) > >>>>>>>>>>>> { > >>>>>>>>>>>> - const bool track_loads > >>>>>>>>>>>> - = aarch64_tune_params.ldp_policy_model != > >>>>>>>>>>>> AARCH64_LDP_STP_POLICY_NEVER; > >>>>>>>>>>>> - const bool track_stores > >>>>>>>>>>>> - = aarch64_tune_params.stp_policy_model != > >>>>>>>>>>>> AARCH64_LDP_STP_POLICY_NEVER; > >>>>>>>>>>>> + const bool track_loads = track_load_p (); > >>>>>>>>>>>> + const bool track_stores = track_store_p (); > >>>>>>>>>>>> > >>>>>>>>>>>> - ldp_bb_info bb_state (bb); > >>>>>>>>>>> > >>>>>>>>>>> This: > >>>>>>>>>>> > >>>>>>>>>>>> + aarch64_pair_fusion derived; > >>>>>>>>>>> > >>>>>>>>>>> can be deleted and then: > >>>>>>>>>>> > >>>>>>>>>>>> + pair_fusion_bb_info bb_info (bb, &derived); > >>>>>>>>>>> > >>>>>>>>>>> can just be: > >>>>>>>>>>> > >>>>>>>>>>> pair_fusion_bb_info bb_info (bb, this); > >>>>>>>>>>> > >>>>>>>>>>> (or you can pass *this if you make bb_info take a reference). > >>>>>>>>>>> > >>>>>>>>>>> I don't think there's a particular need to change the variable > >>>>>>>>>>> name > >>>>>>>>>>> (bb_state -> bb_info). I chose the former because it doens't > >>>>>>>>>>> clash > >>>>>>>>>>> with the RTL-SSA structure of the same name as the latter. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Addressed. > >>>>>>>>>>>> > >>>>>>>>>>>> for (auto insn : bb->nondebug_insns ()) > >>>>>>>>>>>> { > >>>>>>>>>>>> @@ -3184,31 +3311,31 @@ void ldp_fusion_bb (bb_info *bb) > >>>>>>>>>>>> continue; > >>>>>>>>>>>> > >>>>>>>>>>>> rtx pat = PATTERN (rti); > >>>>>>>>>>>> - if (reload_completed > >>>>>>>>>>>> - && aarch64_ldp_writeback > 1 > >>>>>>>>>>>> - && GET_CODE (pat) == PARALLEL > >>>>>>>>>>>> - && XVECLEN (pat, 0) == 2) > >>>>>>>>>>>> + if (pair_mem_promote_writeback_p (pat)) > >>>>>>>>>>>> try_promote_writeback (insn); > >>>>>>>>>>> > >>>>>>>>>>> It looks like try_promote_writeback itself will need some further > >>>>>>>>>>> work > >>>>>>>>>>> to make it target-independent. I suppose this check: > >>>>>>>>>>> > >>>>>>>>>>> auto rti = insn->rtl (); > >>>>>>>>>>> const auto attr = get_attr_ldpstp (rti); > >>>>>>>>>>> if (attr == LDPSTP_NONE) > >>>>>>>>>>> return; > >>>>>>>>>>> > >>>>>>>>>>> bool load_p = (attr == LDPSTP_LDP); > >>>>>>>>>>> gcc_checking_assert (load_p || attr == LDPSTP_STP); > >>>>>>>>>>> > >>>>>>>>>>> will need to become part of the pair_mem_promote_writeback_p hook > >>>>>>>>>>> that you > >>>>>>>>>>> added, potentially changing it to return a boolean for load_p. > >>>>>>>>>>> > >>>>>>>>>>> Then I guess we will need hooks for destructuring the pair insn > >>>>>>>>>>> and > >>>>>>>>>>> another hook to wrap aarch64_gen_writeback_pair. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Addressed. > >>>>>>>>>>>> > >>>>>>>>>>>> if (GET_CODE (pat) != SET) > >>>>>>>>>>>> continue; > >>>>>>>>>>>> > >>>>>>>>>>>> if (track_stores && MEM_P (XEXP (pat, 0))) > >>>>>>>>>>>> - bb_state.track_access (insn, false, XEXP (pat, 0)); > >>>>>>>>>>>> + bb_info.track_access (insn, false, XEXP (pat, 0)); > >>>>>>>>>>>> else if (track_loads && MEM_P (XEXP (pat, 1))) > >>>>>>>>>>>> - bb_state.track_access (insn, true, XEXP (pat, 1)); > >>>>>>>>>>>> + bb_info.track_access (insn, true, XEXP (pat, 1)); > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> - bb_state.transform (); > >>>>>>>>>>>> - bb_state.cleanup_tombstones (); > >>>>>>>>>>>> + bb_info.transform (); > >>>>>>>>>>>> + bb_info.cleanup_tombstones (); > >>>>>>>>>>>> } > >>>>>>>>>>>> > >>>>>>>>>>>> void ldp_fusion () > >>>>>>>>>>>> { > >>>>>>>>>>>> ldp_fusion_init (); > >>>>>>>>>>>> + pair_fusion *pfuse; > >>>>>>>>>>>> + aarch64_pair_fusion derived; > >>>>>>>>>>>> + pfuse = &derived; > >>>>>>>>>>> > >>>>>>>>>>> This is indeed the one place where I think it is acceptable to > >>>>>>>>>>> instantiate aarch64_pair_fusion. But again there's no need to > >>>>>>>>>>> create a > >>>>>>>>>>> pointer to the parent class, just call any function you like > >>>>>>>>>>> directly. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Addressed. > >>>>>>>>>>>> > >>>>>>>>>>>> for (auto bb : crtl->ssa->bbs ()) > >>>>>>>>>>>> - ldp_fusion_bb (bb); > >>>>>>>>>>>> + pfuse->ldp_fusion_bb (bb); > >>>>>>>>>>> > >>>>>>>>>>> I think even the code to iterate over bbs should itself be a > >>>>>>>>>>> member > >>>>>>>>>>> function of pair_fusion (say "run") and then that becomes part of > >>>>>>>>>>> the > >>>>>>>>>>> generic code. > >>>>>>>>>>> > >>>>>>>>>>> So this function would just become: > >>>>>>>>>>> > >>>>>>>>>>> aarch64_pair_fusion pass; > >>>>>>>>>>> pass.run (); > >>>>>>>>>>> > >>>>>>>>>>> and could be inlined into the caller. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Addressed. > >>>>>>>>>>> Perhaps you could also add an early return like: > >>>>>>>>>>> > >>>>>>>>>>> if (!track_loads_p () && !track_stores_p ()) > >>>>>>>>>>> return; > >>>>>>>>>>> > >>>>>>>>>>> in pair_fusion::run () and then remove the corresponding code from > >>>>>>>>>>> pass_ldp_fusion::gate? > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Addressed. > >>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> ldp_fusion_destroy (); > >>>>>>>>>>>> } > >>>>>>>>>>>> -- > >>>>>>>>>>>> 2.39.3 > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Thanks, > >>>>>>>>>>> Alex > >>>>>>>>>> > >>>>>>>>>> Thanks & Regards > >>>>>>>>>> Ajit