Leiden clustering returns different results for Scanpy 1.10.4 and 1.9.3 #3422

Closed
2 of 3 tasks
alam-shahul opened this issue Dec 30, 2024 · 23 comments

alam-shahul commented Dec 30, 2024

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the main branch of scanpy.

What happened?

I am doing some data analysis that depends on Scanpy, and for reproducibility reasons it is important that the results from Scanpy's implementation of Leiden clustering are consistent across versions.

It seems that the changes to the sc.tl.leiden implementation in v1.10 alter the clustering results, although they are not supposed to: the warning says that the defaults will change in a future release, not that they have changed already.

Can this be patched?
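
When checking whether two versions really disagree, it can help to compare the groupings rather than the raw label strings, since the same clustering may come back with renamed labels. The following is only a minimal sketch using the standard library: `partition_of` and `same_partition` are hypothetical helpers (not part of Scanpy), and the label lists are made-up stand-ins for the `leiden` column exported from each environment.

```python
from collections import defaultdict


def partition_of(labels):
    """Return the set of clusters, each cluster a frozenset of item indices."""
    groups = defaultdict(set)
    for index, label in enumerate(labels):
        groups[label].add(index)
    return {frozenset(members) for members in groups.values()}


def same_partition(labels_a, labels_b):
    """True if both labelings group the items identically, ignoring label names."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same number of items")
    return partition_of(labels_a) == partition_of(labels_b)


# Hypothetical label vectors, standing in for adata.obs["leiden"] exported
# from two environments (for example scanpy 1.9.3 and 1.10.4).
labels_old = ["0", "0", "1", "1", "2"]
labels_renamed = ["1", "1", "0", "0", "2"]  # same grouping, different names
labels_changed = ["0", "0", "0", "1", "2"]  # item 2 moved to another cluster

print(same_partition(labels_old, labels_renamed))  # True
print(same_partition(labels_old, labels_changed))  # False
```

If this kind of check returns False for labels produced with the same seed and parameters, the difference is in the grouping itself and not just in how clusters are numbered.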

Minimal code sample

import numpy as np
import anndata as ad
import scanpy as sc

print(sc.__version__)

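def cluster(dataset, use_rep, random_state: int = 0, resolution: float = 1.0, n_neighbors: int = 20, method="leiden", **kwargs):
    """Build the neighbor graph on `use_rep`, then cluster it with `sc.tl.<method>`; labels are written to `dataset.obs[method]`."""
    # Look up the clustering function by name, e.g. sc.tl.leiden.
    clustering_function = getattr(sc.tl, method)
    # The same seed is passed to both steps so that a run is repeatable within one version.
    sc.pp.neighbors(dataset, use_rep=use_rep, random_state=random_state, n_neighbors=n_neighbors)
    clustering_function(dataset, resolution=resolution, random_state=random_state, key_added=method, **kwargs)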


normalized_X = np.array([[-0.0153584372512808719352106479050235066097229719161987304687500000,
         1.4541326990894569703982597275171428918838500976562500000000000000,
        -0.7405199696290716282476296328241005539894104003906250000000000000,
         0.9335475436959812522985657778917811810970306396484375000000000000,
         1.9652735061761130719304446756723336875438690185546875000000000000],
       [-0.8460207832608741540525443269871175289154052734375000000000000000,
        -0.7837450051463897837678018731821794062852859497070312500000000000,
         0.6811337057503309422301640552177559584379196166992187500000000000,
        -0.8943954236093275556029880135611165314912796020507812500000000000,
        -0.8289938780269628937347192731976974755525588989257812500000000000],
       [ 2.2840823361446140893349365796893835067749023437500000000000000000,
         3.0062084336245096238826590706594288349151611328125000000000000000,
         2.3961251255796200965164644003380089998245239257812500000000000000,
         1.2670404851044729799269816794549115002155303955078125000000000000,
         3.0832084540553585938482683559413999319076538085937500000000000000],
       [-0.8460207832608741540525443269871175289154052734375000000000000000,
        -0.7837450051463897837678018731821794062852859497070312500000000000,
         0.6811337057503309422301640552177559584379196166992187500000000000,
        -0.8943954236093275556029880135611165314912796020507812500000000000,
        -0.8289938780269628937347192731976974755525588989257812500000000000],
       [-0.4734242305255558225240974934422411024570465087890625000000000000,
        -0.7837450051809308204653348184365313500165939331054687500000000000,
         0.3758334286953769476369302537932526320219039916992187500000000000,
        -0.7829839766307843396120347279065754264593124389648437500000000000,
         0.0646128386684944600037994177910150028765201568603515625000000000],
       [ 3.0616708457542394228312332415953278541564941406250000000000000000,
         2.0170238337050148125229043216677382588386535644531250000000000000,
         2.1792068706079423812127515702741220593452453613281250000000000000,
         0.0156031377921165281819071424251887947320938110351562500000000000,
         0.7345204790501289604520707143819890916347503662109375000000000000],
       [-0.8273459175674521270948957862856332212686538696289062500000000000,
        -0.7837450052873845551815179533150512725114822387695312500000000000,
         0.0143552745310629691066761637330273515544831752777099609375000000,
        -0.6432356255675323319920266840199474245309829711914062500000000000,
        -0.5363944281245112133404973064898513257503509521484375000000000000],
       [-0.2532874346238604967851415494806133210659027099609375000000000000,
        -0.7837450053730236065874237283424008637666702270507812500000000000,
        -0.4857388123527687784353190636466024443507194519042968750000000000,
        -0.8943954238327194161684019491076469421386718750000000000000000000,
         0.1541426905725850615702654522465309128165245056152343750000000000],
       [-0.5305295002105545609794035044615156948566436767578125000000000000,
        -0.7837450053545629291562590879038907587528228759765625000000000000,
        -0.5192822985533931401391782856080681085586547851562500000000000000,
        -0.8943954238145227497724931708944495767354965209960937500000000000,
        -0.3497795350634944067103049292200012132525444030761718750000000000],
       [ 4.4802650742716512155539021478034555912017822265625000000000000000,
         4.4901249393718787317197893571574240922927856445312500000000000000,
         1.2781097803925081102249805553583428263664245605468750000000000000,
         2.1118083723495661985225524404086172580718994140625000000000000000,
         0.8991154865433894638471201687934808433055877685546875000000000000],
       [-0.8328382311941189275472652298049069941043853759765625000000000000,
        -0.7837450052641189435220780978852417320013046264648437500000000000,
         0.0983250549318716027258346912276465445756912231445312500000000000,
        -0.8943954237253726180512103383080102503299713134765625000000000000,
        -0.8289938784507908664522801700513809919357299804687500000000000000],
       [-0.8138959182619656118617967877071350812911987304687500000000000000,
        -0.7837450054221178907809530755912419408559799194335937500000000000,
        -0.8933306842504264988491513577173464000225067138671875000000000000,
        -0.8295537358839797370535507070599123835563659667968750000000000000,
        -0.3767077271997943954673360167362261563539505004882812500000000000],
       [-0.8460207838667667123999649447796400636434555053710937500000000000,
        -0.7616500598262874177635239902883768081665039062500000000000000000,
        -0.3644690133000215159775336815073387697339057922363281250000000000,
        -0.8793266558382332176435625115118455141782760620117187500000000000,
        -0.3399815820253418441332371457974659278988838195800781250000000000],
       [ 0.1951746611946994780506514644002891145646572113037109375000000000,
         1.3145087950204568105760927210212685167789459228515625000000000000,
        -0.0631582156087374252395605367382813710719347000122070312500000000,
         0.9749542311739864963726631685858592391014099121093750000000000000,
         1.6184747643740906575260396493831649422645568847656250000000000000],
       [ 0.4130706631407171092185137695196317508816719055175781250000000000,
         0.8685204646025441732604122080374509096145629882812500000000000000,
        -0.1163010678129516167755852507070812862366437911987304687500000000,
         0.0225392201718922699504865647668339079245924949645996093750000000,
        -0.5486454711867800870805922386352904140949249267578125000000000000],
       [ 0.4130706631407171092185137695196317508816719055175781250000000000,
         0.8685204646025441732604122080374509096145629882812500000000000000,
        -0.1163010678129516167755852507070812862366437911987304687500000000,
         0.0225392201718922699504865647668339079245924949645996093750000000,
        -0.5486454711867800870805922386352904140949249267578125000000000000],
       [ 3.0616708457542394228312332415953278541564941406250000000000000000,
         2.0170238337050148125229043216677382588386535644531250000000000000,
         2.1792068706079423812127515702741220593452453613281250000000000000,
         0.0156031377921165281819071424251887947320938110351562500000000000,
         0.7345204790501289604520707143819890916347503662109375000000000000],
       [ 0.5285075708517972259414818836376070976257324218750000000000000000,
         1.3357746201008144915078901249216869473457336425781250000000000000,
        -0.0522676883302916894802336855718749575316905975341796875000000000,
         0.5598116423932187890599720958562102168798446655273437500000000000,
         1.4624874128828768693466599870589561760425567626953125000000000000],
       [ 0.0236648636289298434920436164929924416355788707733154296875000000,
        -0.7837450049185727962708369886968284845352172851562500000000000000,
         0.5317379248941258573779578000539913773536682128906250000000000000,
        -0.7653336298750414989910950680496171116828918457031250000000000000,
         3.5244176718706072826137187803396955132484436035156250000000000000],
       [-0.8460207827758202681067700723360758274793624877929687500000000000,
        -0.7837450049259220286046456749318167567253112792968750000000000000,
         0.9343846861230339362336394515295978635549545288085937500000000000,
        -0.7257113980943485920249713672092184424400329589843750000000000000,
         2.6076365892288526637798895535524934530258178710937500000000000000],
       [ 3.4209692646135319016309495054883882403373718261718750000000000000,
         1.6022738616542011591548089199932292103767395019531250000000000000,
         2.5308411571786533222905291040660813450813293457031250000000000000,
         0.7663358136011872989712401249562390148639678955078125000000000000,
         1.5521627598155478544583729672012850642204284667968750000000000000],
       [-0.6368037004917077581467310665175318717956542968750000000000000000,
         0.4004102135685200791748172832740237936377525329589843750000000000,
         1.0927025508145216114996856049401685595512390136718750000000000000,
         0.1316298381497438829690338479849742725491523742675781250000000000,
         1.2474325507137644652999597383313812315464019775390625000000000000],
       [ 1.4702882621232871152017196436645463109016418457031250000000000000,
         0.3581580488058374034388009476970182731747627258300781250000000000,
         1.3189345483648640122709139177459292113780975341796875000000000000,
         0.6879897353102286672310583526268601417541503906250000000000000000,
         4.3223304219419631522214331198483705520629882812500000000000000000],
       [-0.6488183666006316352437011119036469608545303344726562500000000000,
        -0.7370909339001078253161836073559243232011795043945312500000000000,
        -0.1327179675795511493152645243753795512020587921142578125000000000,
        -0.6821706654414582127188282356655690819025039672851562500000000000,
         0.5146692294781037846362892196339089423418045043945312500000000000],
       [-0.8460207838661558676918161836510989814996719360351562500000000000,
        -0.7393227086572089490346115780994296073913574218750000000000000000,
        -0.7821493613682840573275711903988849371671676635742187500000000000,
        -0.7787776045560447224502809149271342903375625610351562500000000000,
        -0.8055687464231097161260208849853370338678359985351562500000000000],
       [ 0.3758298362762181876739475683280033990740776062011718750000000000,
        -0.7837450049858407652436653734184801578521728515625000000000000000,
         0.8783430291006222301319894540938548743724822998046875000000000000,
        -0.8943954234510753664721960376482456922531127929687500000000000000,
         1.5232272957369399346561067432048730552196502685546875000000000000],
       [ 0.1479965461447061425559468261781148612499237060546875000000000000,
         0.5896343380718306326215838453208561986684799194335937500000000000,
        -0.5713684202521972510524506105866748839616775512695312500000000000,
         0.4990218509748848840779089641728205606341361999511718750000000000,
         0.0690071570934271405484139449981739744544029235839843750000000000],
       [-0.8460207839181261846306369989179074764251708984375000000000000000,
        -0.7837450054451253755871675821254029870033264160156250000000000000,
        -0.8305416395655412786780402711883652955293655395507812500000000000,
        -0.8943954239037895659336641074332874268293380737304687500000000000,
        -0.5241271528255970757470549870049580931663513183593750000000000000],
       [-0.5867463162645063512456999887945130467414855957031250000000000000,
        -0.7837450050999672512830329651478677988052368164062500000000000000,
         0.4630671948358607847850976213521789759397506713867187500000000000,
        -0.3400987987774662935080982606450561434030532836914062500000000000,
         1.5480310435966109228189679924980737268924713134765625000000000000],
       [-0.4158489427679883387867221244960092008113861083984375000000000000,
        -0.7837450048804643909505784904467873275279998779296875000000000000,
        -0.0884627481374511692724382783126202411949634552001953125000000000,
         0.1351702468419188341020031884909258224070072174072265625000000000,
         0.8281873482629650284891908995632547885179519653320312500000000000],
       [-0.8145754781057903404217768184025771915912628173828125000000000000,
        -0.7837450051771055470339888415765017271041870117187500000000000000,
         0.7851849508560109214272415556479245424270629882812500000000000000,
        -0.6005046927625529162853013076528441160917282104492187500000000000,
         0.6658386916681061640232996978738810867071151733398437500000000000],
       [ 0.6521332796919149243919378022837918251752853393554687500000000000,
        -0.4115124068017598779611887493956601247191429138183593750000000000,
         1.1223355965786423471541866092593409121036529541015625000000000000,
         2.7427623393550337738133748644031584262847900390625000000000000000,
         0.8983407581903105620924065988219808787107467651367187500000000000],
       [-0.1365323442009934717944474869000259786844253540039062500000000000,
        -0.7837450049114751404744083629339002072811126708984375000000000000,
         0.3818251989468823270890140975097892805933952331542968750000000000,
        -0.1955675179365352900351382459120941348373889923095703125000000000,
         1.3237565845344692050389312498737126588821411132812500000000000000],
       [ 0.3758298362762181876739475683280033990740776062011718750000000000,
        -0.7837450049858407652436653734184801578521728515625000000000000000,
         0.8783430291006222301319894540938548743724822998046875000000000000,
        -0.8943954234510753664721960376482456922531127929687500000000000000,
         1.5232272957369399346561067432048730552196502685546875000000000000],
       [-0.8460207835339851323297466478834394365549087524414062500000000000,
        -0.6751132952741660808726464892970398068428039550781250000000000000,
         0.0618111874464656527172756739219039445742964744567871093750000000,
        -0.8943954237316867894591609911003615707159042358398437500000000000,
        -0.8289938784738521970751889966777525842189788818359375000000000000],
       [ 0.6521332796919149243919378022837918251752853393554687500000000000,
        -0.4115124068017598779611887493956601247191429138183593750000000000,
         1.1223355965786423471541866092593409121036529541015625000000000000,
         2.7427623393550337738133748644031584262847900390625000000000000000,
         0.8983407581903105620924065988219808787107467651367187500000000000],
       [-0.8460207829164639870000996779708657413721084594726562500000000000,
        -0.7837450049898475601395375633728690445423126220703125000000000000,
        -0.1444292099307478982606056661097682081162929534912109375000000000,
        -0.1426531254139287163695826166076585650444030761718750000000000000,
        -0.1498325526705227284107735385987325571477413177490234375000000000],
       [-0.8125036465000587959650601987959817051887512207031250000000000000,
        -0.7837450053509567027276716544292867183685302734375000000000000000,
        -0.1205377883312909736979179342597490176558494567871093750000000000,
        -0.8943954238109680376922483446833211928606033325195312500000000000,
        -0.8289938787634094641276760739856399595737457275390625000000000000],
       [-0.8460207832608741540525443269871175289154052734375000000000000000,
        -0.7837450051463897837678018731821794062852859497070312500000000000,
         0.6811337057503309422301640552177559584379196166992187500000000000,
        -0.8943954236093275556029880135611165314912796020507812500000000000,
        -0.8289938780269628937347192731976974755525588989257812500000000000],
       [ 2.8890836418823302444991441007005050778388977050781250000000000000,
         4.4636548213358118175619893008843064308166503906250000000000000000,
         1.6442550264869968401626465492881834506988525390625000000000000000,
         2.3938479518683108970833472994854673743247985839843750000000000000,
         2.5157885620638777979252154182177037000656127929687500000000000000],
       [-0.8460207827758202681067700723360758274793624877929687500000000000,
        -0.7837450049259220286046456749318167567253112792968750000000000000,
         0.9343846861230339362336394515295978635549545288085937500000000000,
        -0.7257113980943485920249713672092184424400329589843750000000000000,
         2.6076365892288526637798895535524934530258178710937500000000000000],
       [-0.7953100850661093401683388037781696766614913940429687500000000000,
        -0.7585137581534366901792054704856127500534057617187500000000000000,
        -0.0801628764669323201630390940408688038587570190429687500000000000,
        -0.8327300871891057942875136177462991327047348022460937500000000000,
         1.1777063917799182046763917242060415446758270263671875000000000000],
       [ 0.0195460072543246314580311917552535305730998516082763671875000000,
        -0.1088374072933334518742398699941986706107854843139648437500000000,
        -0.8230680185287699845986253421870060265064239501953125000000000000,
         0.0606202946667430825344879963267885614186525344848632812500000000,
         0.6007249216331647101441149061429314315319061279296875000000000000],
       [-0.8460207831185517779459814846632070839405059814453125000000000000,
        -0.7837450050817008628811777271039318293333053588867187500000000000,
         0.4120268149352634412529994278884259983897209167480468750000000000,
        -0.5093521698243252959770188681432045996189117431640625000000000000,
        -0.8289938777940817349332291996688582003116607666015625000000000000],
       [-0.8421473220901878331545731271035037934780120849609375000000000000,
        -0.7006778904317614298236094327876344323158264160156250000000000000,
        -0.9003074080658106659313943964662030339241027832031250000000000000,
        -0.8301161780646421073370788690226618200540542602539062500000000000,
        -0.2185970182121793614626881208096165210008621215820312500000000000],
       [-0.6488183666006316352437011119036469608545303344726562500000000000,
        -0.7370909339001078253161836073559243232011795043945312500000000000,
        -0.1327179675795511493152645243753795512020587921142578125000000000,
        -0.6821706654414582127188282356655690819025039672851562500000000000,
         0.5146692294781037846362892196339089423418045043945312500000000000],
       [-0.8460207829039355642564146364748012274503707885742187500000000000,
        -0.7837450049841531152239326729613821953535079956054687500000000000,
         0.1603547062552740376517590448202099651098251342773437500000000000,
        -0.4338038380930199222618171006615739315748214721679687500000000000,
         1.5034554656109397896557311469223350286483764648437500000000000000],
       [ 2.5412505799998994504562688234727829694747924804687500000000000000,
         5.0519281144669649918910181440878659486770629882812500000000000000,
         2.6956605446958747940300327172735705971717834472656250000000000000,
         1.8332771257582531898577826723339967429637908935546875000000000000,
         0.6058121782944819733174313114432152360677719116210937500000000000],
       [ 3.0616708457542394228312332415953278541564941406250000000000000000,
         2.0170238337050148125229043216677382588386535644531250000000000000,
         2.1792068706079423812127515702741220593452453613281250000000000000,
         0.0156031377921165281819071424251887947320938110351562500000000000,
         0.7345204790501289604520707143819890916347503662109375000000000000],
       [-0.5305295002105545609794035044615156948566436767578125000000000000,
        -0.7837450053545629291562590879038907587528228759765625000000000000,
        -0.5192822985533931401391782856080681085586547851562500000000000000,
        -0.8943954238145227497724931708944495767354965209960937500000000000,
        -0.3497795350634944067103049292200012132525444030761718750000000000],
       [-0.8460207839190992951117209486255887895822525024414062500000000000,
        -0.7837450054455675774178757819754537194967269897460937500000000000,
        -0.4873360812061122149252412327768979594111442565917968750000000000,
        -0.8943954239042255505154344064067117869853973388671875000000000000,
        -0.6091701529523837477242409477184992283582687377929687500000000000],
       [-0.8273459175674521270948957862856332212686538696289062500000000000,
        -0.7837450052873845551815179533150512725114822387695312500000000000,
         0.0143552745310629691066761637330273515544831752777099609375000000,
        -0.6432356255675323319920266840199474245309829711914062500000000000,
        -0.5363944281245112133404973064898513257503509521484375000000000000],
       [-0.6368037004917077581467310665175318717956542968750000000000000000,
         0.4004102135685200791748172832740237936377525329589843750000000000,
         1.0927025508145216114996856049401685595512390136718750000000000000,
         0.1316298381497438829690338479849742725491523742675781250000000000,
         1.2474325507137644652999597383313812315464019775390625000000000000],
       [ 3.1203980093546443974616977357072755694389343261718750000000000000,
         3.6678869957134567769685418170411139726638793945312500000000000000,
         0.0565560778368175420816044152161339297890663146972656250000000000,
         1.6512396363709753721593642694642767310142517089843750000000000000,
        -0.3720881250099122294905384933372261002659797668457031250000000000],
       [ 0.4967379313976291732579682047798996791243553161621093750000000000,
         1.8921292432687131235269362150575034320354461669921875000000000000,
        -0.7458458426821183984145591239212080836296081542968750000000000000,
         0.8300249230367441333200417830084916204214096069335937500000000000,
        -0.8289938757928638768390783297945745289325714111328125000000000000],
       [-0.8219815082778527681739433319307863712310791015625000000000000000,
        -0.7837450052829718627478428061294835060834884643554687500000000000,
        -0.0160983634108890737157704364790333784185349941253662109375000000,
        -0.8890015852070789481587098634918220341205596923828125000000000000,
        -0.8289938785186626857282021774153690785169601440429687500000000000],
       [ 0.0258575856138216833568499453122058184817433357238769531250000000,
        -0.0712205350006679810404008890145632904022932052612304687500000000,
        -0.4918502348820242286997483915911288931965827941894531250000000000,
        -0.3278262348743137821749371596524724736809730529785156250000000000,
        -0.2028265701860953729163128400614368729293346405029296875000000000],
       [-0.4193911329482012950720104527135845273733139038085937500000000000,
        -0.7837450042984314135807721868332009762525558471679687500000000000,
         1.8898639563230668070303863714798353612422943115234375000000000000,
         1.1576418227991707166069090817472897469997406005859375000000000000,
         4.6738985883387469399963265459518879652023315429687500000000000000],
       [ 3.0616708457542394228312332415953278541564941406250000000000000000,
         2.0170238337050148125229043216677382588386535644531250000000000000,
         2.1792068706079423812127515702741220593452453613281250000000000000,
         0.0156031377921165281819071424251887947320938110351562500000000000,
         0.7345204790501289604520707143819890916347503662109375000000000000],
       [ 0.1479965461447061425559468261781148612499237060546875000000000000,
         0.5896343380718306326215838453208561986684799194335937500000000000,
        -0.5713684202521972510524506105866748839616775512695312500000000000,
         0.4990218509748848840779089641728205606341361999511718750000000000,
         0.0690071570934271405484139449981739744544029235839843750000000000],
       [-0.6488183666006316352437011119036469608545303344726562500000000000,
        -0.7370909339001078253161836073559243232011795043945312500000000000,
        -0.1327179675795511493152645243753795512020587921142578125000000000,
        -0.6821706654414582127188282356655690819025039672851562500000000000,
         0.5146692294781037846362892196339089423418045043945312500000000000],
       [-0.1365323442009934717944474869000259786844253540039062500000000000,
        -0.7837450049114751404744083629339002072811126708984375000000000000,
         0.3818251989468823270890140975097892805933952331542968750000000000,
        -0.1955675179365352900351382459120941348373889923095703125000000000,
         1.3237565845344692050389312498737126588821411132812500000000000000],
       [-0.8460207835339851323297466478834394365549087524414062500000000000,
        -0.6751132952741660808726464892970398068428039550781250000000000000,
         0.0618111874464656527172756739219039445742964744567871093750000000,
        -0.8943954237316867894591609911003615707159042358398437500000000000,
        -0.8289938784738521970751889966777525842189788818359375000000000000],
       [ 2.5412505799998994504562688234727829694747924804687500000000000000,
         5.0519281144669649918910181440878659486770629882812500000000000000,
         2.6956605446958747940300327172735705971717834472656250000000000000,
         1.8332771257582531898577826723339967429637908935546875000000000000,
         0.6058121782944819733174313114432152360677719116210937500000000000],
       [-0.8460207836950575099876914464402943849563598632812500000000000000,
        -0.7837450053437357011532071737747173756361007690429687500000000000,
        -0.1800862665988901545333078502153512090444564819335937500000000000,
        -0.6192911941102438033723842636391054838895797729492187500000000000,
         0.1420951892317803000320708406434278003871440887451171875000000000],
       [-0.4734242305255558225240974934422411024570465087890625000000000000,
        -0.7837450051809308204653348184365313500165939331054687500000000000,
         0.3758334286953769476369302537932526320219039916992187500000000000,
        -0.7829839766307843396120347279065754264593124389648437500000000000,
         0.0646128386684944600037994177910150028765201568603515625000000000],
       [-0.6488183666006316352437011119036469608545303344726562500000000000,
        -0.7370909339001078253161836073559243232011795043945312500000000000,
        -0.1327179675795511493152645243753795512020587921142578125000000000,
        -0.6821706654414582127188282356655690819025039672851562500000000000,
         0.5146692294781037846362892196339089423418045043945312500000000000],
       [-0.8460207831185517779459814846632070839405059814453125000000000000,
        -0.7837450050817008628811777271039318293333053588867187500000000000,
         0.4120268149352634412529994278884259983897209167480468750000000000,
        -0.5093521698243252959770188681432045996189117431640625000000000000,
        -0.8289938777940817349332291996688582003116607666015625000000000000],
       [-0.6488183666006316352437011119036469608545303344726562500000000000,
        -0.7370909339001078253161836073559243232011795043945312500000000000,
        -0.1327179675795511493152645243753795512020587921142578125000000000,
        -0.6821706654414582127188282356655690819025039672851562500000000000,
         0.5146692294781037846362892196339089423418045043945312500000000000],
       [-0.8460207829039355642564146364748012274503707885742187500000000000,
        -0.7837450049841531152239326729613821953535079956054687500000000000,
         0.1603547062552740376517590448202099651098251342773437500000000000,
        -0.4338038380930199222618171006615739315748214721679687500000000000,
         1.5034554656109397896557311469223350286483764648437500000000000000],
       [-0.4193911329482012950720104527135845273733139038085937500000000000,
        -0.7837450042984314135807721868332009762525558471679687500000000000,
         1.8898639563230668070303863714798353612422943115234375000000000000,
         1.1576418227991707166069090817472897469997406005859375000000000000,
         4.6738985883387469399963265459518879652023315429687500000000000000],
       [-0.5305295002105545609794035044615156948566436767578125000000000000,
        -0.7837450053545629291562590879038907587528228759765625000000000000,
        -0.5192822985533931401391782856080681085586547851562500000000000000,
        -0.8943954238145227497724931708944495767354965209960937500000000000,
        -0.3497795350634944067103049292200012132525444030761718750000000000],
       [-0.1251057656688489005958331290457863360643386840820312500000000000,
        -0.5352726664140089463117533341574016958475112915039062500000000000,
        -0.7039362191830014214843913578079082071781158447265625000000000000,
        -0.3446576662685990610768271835695486515760421752929687500000000000,
        -0.3844800585442448848105811975983669981360435485839843750000000000],
       [-0.5369020817073987261736078835383523255586624145507812500000000000,
        -0.7827221190336854927949161719880066812038421630859375000000000000,
        -1.1825533176627007758696663586306385695934295654296875000000000000,
         2.8792405155012388284774260682752355933189392089843750000000000000,
        -0.4180822477015141425127353613788727670907974243164062500000000000],
       [ 0.0236648636289298434920436164929924416355788707733154296875000000,
        -0.7837450049185727962708369886968284845352172851562500000000000000,
         0.5317379248941258573779578000539913773536682128906250000000000000,
        -0.7653336298750414989910950680496171116828918457031250000000000000,
         3.5244176718706072826137187803396955132484436035156250000000000000],
       [ 4.8800294556063201767415193899068981409072875976562500000000000000,
         4.8714774400810814114493041415698826313018798828125000000000000000,
         2.8467717100831682053296844969736412167549133300781250000000000000,
         1.8125311322658428370147021269076503813266754150390625000000000000,
         2.9379434572659919311377052508760243654251098632812500000000000000],
       [-0.8138959182619656118617967877071350812911987304687500000000000000,
        -0.7837450054221178907809530755912419408559799194335937500000000000,
        -0.8933306842504264988491513577173464000225067138671875000000000000,
        -0.8295537358839797370535507070599123835563659667968750000000000000,
        -0.3767077271997943954673360167362261563539505004882812500000000000],
       [ 0.9313964329416570819830667460337281227111816406250000000000000000,
         2.1768409660968162100402878422755748033523559570312500000000000000,
         0.1147829789045585607842880904172488953918218612670898437500000000,
         1.3686575232567745885603471833746880292892456054687500000000000000,
        -0.7577009893512315352737118701043073087930679321289062500000000000],
       [ 3.4209692646135319016309495054883882403373718261718750000000000000,
         1.6022738616542011591548089199932292103767395019531250000000000000,
         2.5308411571786533222905291040660813450813293457031250000000000000,
         0.7663358136011872989712401249562390148639678955078125000000000000,
         1.5521627598155478544583729672012850642204284667968750000000000000],
       [-0.8460207839181261846306369989179074764251708984375000000000000000,
        -0.7837450054451253755871675821254029870033264160156250000000000000,
        -0.8305416395655412786780402711883652955293655395507812500000000000,
        -0.8943954239037895659336641074332874268293380737304687500000000000,
        -0.5241271528255970757470549870049580931663513183593750000000000000],
       [-0.8460207829164639870000996779708657413721084594726562500000000000,
        -0.7837450049898475601395375633728690445423126220703125000000000000,
        -0.1444292099307478982606056661097682081162929534912109375000000000,
        -0.1426531254139287163695826166076585650444030761718750000000000000,
        -0.1498325526705227284107735385987325571477413177490234375000000000],
       [-0.8460207837305517841741675511002540588378906250000000000000000000,
        -0.7837450053598685739686402484949212521314620971679687500000000000,
        -0.4461508532706117691191138874273747205734252929687500000000000000,
        -0.8943954238197524553299899707781150937080383300781250000000000000,
        -0.5609662700930533318910420348402112722396850585937500000000000000],
       [-0.1020456366182153074007032955705653876066207885742187500000000000,
        -0.6259571115663769003134575541480444371700286865234375000000000000,
         0.5127361179294568360731432221655268222093582153320312500000000000,
        -0.4779422573679412145075673379324143752455711364746093750000000000,
         1.0997892496307530851851197439827956259250640869140625000000000000],
       [-0.7953100850661093401683388037781696766614913940429687500000000000,
        -0.7585137581534366901792054704856127500534057617187500000000000000,
        -0.0801628764669323201630390940408688038587570190429687500000000000,
        -0.8327300871891057942875136177462991327047348022460937500000000000,
         1.1777063917799182046763917242060415446758270263671875000000000000],
       [ 0.0258575856138216833568499453122058184817433357238769531250000000,
        -0.0712205350006679810404008890145632904022932052612304687500000000,
        -0.4918502348820242286997483915911288931965827941894531250000000000,
        -0.3278262348743137821749371596524724736809730529785156250000000000,
        -0.2028265701860953729163128400614368729293346405029296875000000000],
       [ 0.4292316107291164195558508254180196672677993774414062500000000000,
        -0.7837450050057792605429085597279481589794158935546875000000000000,
        -0.2374933168213178846794875198611407540738582611083984375000000000,
         0.1499434449435289307128726932205609045922756195068359375000000000,
         0.4859466008123338731650164845632389187812805175781250000000000000],
       [ 0.6568220782822453696070397199946455657482147216796875000000000000,
         2.1587165477446275119177698798011988401412963867187500000000000000,
        -0.3172864137782155369293946023390162736177444458007812500000000000,
         1.0020461334366970174158950612763874232769012451171875000000000000,
         2.1624430790720010620020730129908770322799682617187500000000000000],
       [ 0.9841764345248787959619107823527883738279342651367187500000000000,
         1.2193230334000608738875826020375825464725494384765625000000000000,
         0.3253120834398079419536031764437211677432060241699218750000000000,
        -0.0500379819108021511864237140798650216311216354370117187500000000,
        -0.2682506011678184032476224274432752281427383422851562500000000000],
       [ 4.4802650742716512155539021478034555912017822265625000000000000000,
         4.4901249393718787317197893571574240922927856445312500000000000000,
         1.2781097803925081102249805553583428263664245605468750000000000000,
         2.1118083723495661985225524404086172580718994140625000000000000000,
         0.8991154865433894638471201687934808433055877685546875000000000000],
       [-0.8460207837312709866495197275071404874324798583984375000000000000,
        -0.7837450053601954236270898945804219692945480346679687500000000000,
        -0.1618930807676848826481830201373668387532234191894531250000000000,
        -0.8943954238200746420517361912061460316181182861328125000000000000,
        -0.8289938787966693034547915885923430323600769042968750000000000000],
       [-0.2171196049595259525144541612462489865720272064208984375000000000,
        -0.4645236023303371819537233022856526076793670654296875000000000000,
         1.3510649428708136898791281055309809744358062744140625000000000000,
         0.3042451066909694912254735754686407744884490966796875000000000000,
         1.8433195329406841800334859726717695593833923339843750000000000000],
       [-0.5305295002105545609794035044615156948566436767578125000000000000,
        -0.7837450053545629291562590879038907587528228759765625000000000000,
        -0.5192822985533931401391782856080681085586547851562500000000000000,
        -0.8943954238145227497724931708944495767354965209960937500000000000,
        -0.3497795350634944067103049292200012132525444030761718750000000000],
       [-0.8460207838667667123999649447796400636434555053710937500000000000,
        -0.7616500598262874177635239902883768081665039062500000000000000000,
        -0.3644690133000215159775336815073387697339057922363281250000000000,
        -0.8793266558382332176435625115118455141782760620117187500000000000,
        -0.3399815820253418441332371457974659278988838195800781250000000000],
       [ 4.4802650742716512155539021478034555912017822265625000000000000000,
         4.4901249393718787317197893571574240922927856445312500000000000000,
         1.2781097803925081102249805553583428263664245605468750000000000000,
         2.1118083723495661985225524404086172580718994140625000000000000000,
         0.8991154865433894638471201687934808433055877685546875000000000000],
       [ 0.6521332796919149243919378022837918251752853393554687500000000000,
        -0.4115124068017598779611887493956601247191429138183593750000000000,
         1.1223355965786423471541866092593409121036529541015625000000000000,
         2.7427623393550337738133748644031584262847900390625000000000000000,
         0.8983407581903105620924065988219808787107467651367187500000000000],
       [ 0.3214989902069836924525247923156712204217910766601562500000000000,
         1.1846680286023187900212860768078826367855072021484375000000000000,
         0.1230887324770978952237499015609500929713249206542968750000000000,
         0.4839258769661285985996812541998224332928657531738281250000000000,
         2.2764002559849383366952224605483934283256530761718750000000000000],
       [-0.8460207837305517841741675511002540588378906250000000000000000000,
        -0.7837450053598685739686402484949212521314620971679687500000000000,
        -0.4461508532706117691191138874273747205734252929687500000000000000,
        -0.8943954238197524553299899707781150937080383300781250000000000000,
        -0.5609662700930533318910420348402112722396850585937500000000000000],
       [ 4.1517728075474433779845639946870505809783935546875000000000000000,
         5.0121108230458810695040483551565557718276977539062500000000000000,
         2.3825774956620602296197830582968890666961669921875000000000000000,
        -0.0685845095517776948135235670633846893906593322753906250000000000,
         1.5944995590079562575169802585151046514511108398437500000000000000],
       [-0.8460207837500792749096945044584572315216064453125000000000000000,
        -0.7837450053687442519390060624573379755020141601562500000000000000,
         0.2143742173961561692241417631521471776068210601806640625000000000,
        -0.7992074942294311590273991896538063883781433105468750000000000000,
         0.3050188145134497541555163024895591661334037780761718750000000000],
       [ 0.4462320265488722847990743503032717853784561157226562500000000000,
        -0.7837450018442755350633888156153261661529541015625000000000000000,
         1.4219099186735186801655572708114050328731536865234375000000000000,
         6.4612455876036793966932236799038946628570556640625000000000000000,
         0.7017224933831742728074232218204997479915618896484375000000000000]]
)


random_subset = ad.AnnData(
    X=np.zeros_like(normalized_X),
    obsm={
        "normalized_X": normalized_X,
    }
)

cluster(random_subset, use_rep="normalized_X")

Error output

Versions


@alam-shahul added the Triage 🩺 label Dec 30, 2024
@flying-sheep
Member

Hm, I need to take a closer look at that. I didn’t try to reproduce it, but logically it seems impossible, since the code’s behavior looks identical to me:

git diff 1.9.3:scanpy/tools/_leiden.py 1.10.4:src/scanpy/tools/_leiden.py

@alam-shahul
Author

alam-shahul commented Jan 15, 2025

@flying-sheep I just tested it again and it still seems to be inconsistent between 1.9.3 and 1.10.4... again, this seems like a problematic input, since it doesn't seem to be inconsistent for other mock data.

Another possibility is that the implementation of sc.pp.neighbors has changed...?

@ilan-gold self-assigned this Jan 16, 2025
@ilan-gold removed the Triage 🩺 label Jan 16, 2025
@ilan-gold added this to the 1.11.0 milestone Jan 16, 2025
@ilan-gold
Contributor

@alam-shahul Thanks for the pointers, both the obsp['connectivities'] and obsp['distances'] differ between the implementations.
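
One way to run this kind of comparison, sketched under assumptions: save `adata.obsp["distances"]` and `adata.obsp["connectivities"]` in each environment (for example with `scipy.sparse.save_npz`), load both copies in one session, and summarize where they disagree. The helper name `compare_sparse`, the tolerance, and the toy matrices are hypothetical and not part of scanpy.

```python
import numpy as np
from scipy import sparse


def compare_sparse(a, b, atol=1e-5):
    """Summarize how two sparse kNN graphs of the same shape differ."""
    a = sparse.csr_matrix(a)
    b = sparse.csr_matrix(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    diff = abs(a - b)
    return {
        # Number of stored entries in each graph.
        "nnz": (a.nnz, b.nnz),
        # Number of positions whose values differ by more than atol.
        "n_differing": int((diff > atol).sum()),
        "max_abs_diff": float(diff.max()),
    }


# Toy stand-ins for the graphs produced by two scanpy versions.
old = sparse.csr_matrix(np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 2.0], [0.0, 2.0, 0.0]]))
new = sparse.csr_matrix(np.array([[0.0, 1.0, 0.5], [1.0, 0.0, 2.0], [0.0, 2.0, 0.0]]))
report = compare_sparse(old, new)
print(report)
```

With real data, `old` and `new` would be the matrices loaded via `scipy.sparse.load_npz` from the two environments.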

@ilan-gold
Contributor

I think the culprit lies with our random state handling: #2946 (comment)

@keller-mark

I suspect (at least for #2946) the culprit is not ScanPy directly but one of its dependencies (specifically, a change in the dependency's internals between two versions). I am not sure which dependency specifically at the moment.

@ilan-gold
Contributor

So PR #2536 caused the change. I'll need to look into why.

@flying-sheep
Member

flying-sheep commented Jan 17, 2025

That PR was huge. I did my best to keep it compatible but I’m not surprised that something slipped through.

IIRC, the main difference is that after the PR, sklearn.neighbors.KNeighborsTransformer is used instead of sklearn.metrics.pairwise_distances. I did write/adapt tests to check if the results are the same, but maybe not enough. I think the most worthwhile investigation would be to check if bringing pairwise_distances back fixes things.

If it does, the ideal fix would be

  1. we’d write a new transformer that uses it under the hood and has its .transform return the dense matrix returned by pairwise_distances (KNeighborsTransformer always returns a sparse matrix).
  2. we use that transformer under the exact circumstances that pairwise_distances was used before.

On the other hand, people have been using the new code for a year, so going back might break their backwards compatibility.

@ilan-gold
Contributor

Ok @flying-sheep following your suggestion I was able to get the connectivities to match but not the distances.

I would assume the culprit lies here: https://github.com/scverse/scanpy/pull/2536/files#diff-934eec0a4b88db7c4f7c099d803a25f6b81ca654579bd1ee84fd28b7858b2de2L380-L423

We don't re-get the distances in the new implementation, although I am not really familiar with this part of things, so it's tough to know exactly. The original umap function calls _get_sparse_matrix_from_indices_distances_umap, for which I don't see an equivalent in the PR. Indeed, bringing this call back as

self._distances = _get_sparse_matrix_from_indices_distances_umap(
    knn_indices, knn_distances, self._adata.shape[0], self.n_neighbors
)

seems to fix things up to 5 decimal places.

But I am super out of my depth here - I don't really understand why it was removed, or why it was there in the first place.

Here's the very rough git diff that makes things work with the above example:

diff --git a/src/scanpy/neighbors/__init__.py b/src/scanpy/neighbors/__init__.py
index 21404372..d4e24d07 100644
--- a/src/scanpy/neighbors/__init__.py
+++ b/src/scanpy/neighbors/__init__.py
@@ -10,6 +10,7 @@ from warnings import warn
 import numpy as np
 import scipy
 from scipy.sparse import issparse
+from sklearn.metrics import pairwise_distances
 from sklearn.utils import check_random_state
 
 from .. import _utils
@@ -19,6 +20,7 @@ from .._settings import settings
 from .._utils import NeighborsView, _doc_params, get_literal_vals
 from . import _connectivity
 from ._common import (
+    _get_indices_distances_from_dense_matrix,
     _get_indices_distances_from_sparse_matrix,
     _get_sparse_matrix_from_indices_distances,
 )
@@ -571,8 +573,8 @@ class Neighbors:
         self.n_neighbors = n_neighbors
         self.knn = knn
         X = _choose_representation(self._adata, use_rep=use_rep, n_pcs=n_pcs)
-        self._distances = transformer.fit_transform(X)
-        knn_indices, knn_distances = _get_indices_distances_from_sparse_matrix(
+        self._distances = pairwise_distances(X, metric=metric, **metric_kwds)
+        knn_indices, knn_distances = _get_indices_distances_from_dense_matrix(
             self._distances, n_neighbors
         )
         if shortcut:
@@ -593,14 +595,42 @@ class Neighbors:
                 with contextlib.suppress(Exception):
                     self._rp_forest = _make_forest_dict(index)
         start_connect = logg.debug("computed neighbors", time=start_neighbors)
-
         if method == "umap":
+
+            def _get_sparse_matrix_from_indices_distances_umap(
+                knn_indices, knn_dists, n_obs, n_neighbors
+            ):
+                rows = np.zeros((n_obs * n_neighbors), dtype=np.int64)
+                cols = np.zeros((n_obs * n_neighbors), dtype=np.int64)
+                vals = np.zeros((n_obs * n_neighbors), dtype=np.float64)
+
+                for i in range(knn_indices.shape[0]):
+                    for j in range(n_neighbors):
+                        if knn_indices[i, j] == -1:
+                            continue  # We didn't get the full knn for i
+                        if knn_indices[i, j] == i:
+                            val = 0.0
+                        else:
+                            val = knn_dists[i, j]
+
+                        rows[i * n_neighbors + j] = i
+                        cols[i * n_neighbors + j] = knn_indices[i, j]
+                        vals[i * n_neighbors + j] = val
+                import scipy.sparse as sp
+
+                result = sp.coo_matrix((vals, (rows, cols)), shape=(n_obs, n_obs))
+                result.eliminate_zeros()
+                return result.tocsr()
+
             self._connectivities = _connectivity.umap(
                 knn_indices,
                 knn_distances,
                 n_obs=self._adata.shape[0],
                 n_neighbors=self.n_neighbors,
             )
+            self._distances = _get_sparse_matrix_from_indices_distances_umap(
+                knn_indices, knn_distances, self._adata.shape[0], self.n_neighbors
+            )
         elif method == "gauss":
             self._connectivities = _connectivity.gauss(
                 self._distances, self.n_neighbors, knn=self.knn

As for what we should do about this, I am also a bit at a loss. I would tend towards making a public announcement and going back to the original behavior. Certainly, something should be done, or at least stated.

@flying-sheep
Member

flying-sheep commented Jan 17, 2025

def _get_sparse_matrix_from_indices_distances_umap(...): ...

OK, let’s think about this. What it does should already happen here:

if knn:  # remove too far away entries in self._distances
    self._distances = _get_sparse_matrix_from_indices_distances(
        knn_indices, knn_distances, keep_self=False
    )

Indeed: why and how does _get_sparse_matrix_from_indices_distances work differently than the …_umap version?

  • Does the -1 business come into play? I don’t think so
  • Do we handle the i==j case? I think so, that’s what keep_self=False above is for

@ilan-gold
Contributor

I think _get_sparse_matrix_from_indices_distances and _get_sparse_matrix_from_indices_distances_umap have different implementations. Are you saying that _get_sparse_matrix_from_indices_distances_umap's functionality has been perfectly recaptured in _get_sparse_matrix_from_indices_distances?

I am also curious as to why _get_sparse_matrix_from_indices_distances_umap was removed?

@flying-sheep
Member

> Are you saying that _get_…_umap's functionality has been perfectly recaptured in _get_…?

The umap version also deals with -1s which we never feed in. Otherwise I can’t see a difference other than the umap version being dog slow. That’s why it was removed: it’s a slower version of something we have with features we don’t use.

But you’re saying switching it out makes a difference so seems like I’m wrong. I just wonder what I’m not seeing

@ilan-gold
Contributor

Ok great to know. I'll have a closer look then.

@flying-sheep
Member

flying-sheep commented Jan 20, 2025

Hm, I don’t think I can do much right now, I’m blocked by lmcinnes/pynndescent#250

/edit: nevermind, I found a workaround: lmcinnes/pynndescent#250 (comment)

@flying-sheep
Member

flying-sheep commented Jan 20, 2025

Hmm, if we look at the number of stored distances per kNN matrix row, we get the following. (Rows with fewer than 19 entries are just ones with identical rows among their neighbors, i.e. distance 0, so those entries get eliminated in the old version of the code.)

dists_transformer.getnnz(1) = array([
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19,
       19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19],
      dtype=int32)
dists_pairwise.getnnz(1) = array([
       19, 17, 19, 17, 18, 19, 18, 19, 16, 19, 19, 19, 18, 19, 18, 18, 19,
       19, 18, 18, 18, 18, 19, 15, 19, 19, 18, 18, 19, 19, 19, 17, 18, 19,
       18, 17, 18, 19, 17, 19, 18, 18, 19, 18, 19, 15, 18, 18, 19, 16, 19,
       18, 18, 19, 19, 19, 18, 18, 19, 18, 15, 18, 18, 18, 19, 18, 15, 18,
       15, 18, 18, 16, 19, 19, 18, 19, 19, 19, 18, 18, 18, 18, 19, 18, 18,
       19, 19, 19, 19, 19, 19, 16, 18, 19, 17, 19, 18, 19, 19, 19],
      dtype=int32)

The result is a high variance in how many distances differ per column between the two versions (up to 12), while the rows are very close (mostly 0–2, sometimes 4 differing distances).

diff.eliminate_zeros()
diff.getnnz(axis=0) = array([
       0,  0,  0,  0,  2,  0,  1,  0,  9,  0,  0,  1,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  2,  0,  6,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
       0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  2,  0,  0,  0,  2,  0,
       1,  2,  0,  0,  0,  0,  0,  0,  0,  6,  0,  0,  0,  0,  2,  6,  0,
       4,  0,  0,  3,  0,  0,  0,  0,  1,  0,  0,  0,  0,  1,  0,  0,  0,
       0,  0,  0,  0,  0,  0, 12,  0,  0,  0,  0,  1,  0,  0,  0])
diff.getnnz(axis=1) = array([
       0, 0, 0, 0, 2, 0, 2, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 2,
       0, 2, 0, 4, 4, 0, 0, 0, 0, 0, 0, 4, 2, 0, 2, 0, 0, 0, 2, 0, 0, 0,
       2, 2, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 4, 2, 0, 2, 0, 0, 2,
       2, 0, 2, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0])

As a particularly egregious example, here are column 91 (h) and row 91 (v), respectively: np.vstack([trans.toarray()[:, 91], pair.toarray()[:, 91], trans.toarray()[91, :], pair.toarray()[91, :]]).

As you can see, the set of kNN lists that contain sample 91 varies a lot between the two versions.

Diff details
             h                     v          
   transformer  pairwise transformer  pairwise
0     0.000000  0.000000    0.000000  0.000000
1     0.000000  0.000000    0.000000  0.000000
2     0.000000  0.000000    0.000000  0.000000
3     0.000000  0.000000    0.000000  0.000000
4     0.994297  0.000000    0.000000  0.000000
5     0.000000  0.000000    0.000000  0.000000
6     0.686131  0.000000    0.686131  0.686131
7     0.576130  0.576130    0.576130  0.576130
8     0.000000  0.000000    0.000000  0.000000
9     0.000000  0.000000    0.000000  0.000000
10    0.000000  0.838138    0.000000  0.000000
11    0.474487  0.474487    0.474487  0.474487
12    0.352581  0.352581    0.352581  0.352581
13    0.000000  0.000000    0.000000  0.000000
14    2.159415  2.159415    0.000000  0.000000
15    2.159415  2.159415    0.000000  0.000000
16    0.000000  0.000000    0.000000  0.000000
17    0.000000  0.000000    0.000000  0.000000
18    0.000000  0.000000    0.000000  0.000000
19    0.000000  0.000000    0.000000  0.000000
20    0.000000  0.000000    0.000000  0.000000
21    0.000000  0.000000    0.000000  0.000000
22    0.000000  0.000000    0.000000  0.000000
23    0.000000  0.978730    0.000000  0.000000
24    0.625874  0.625874    0.625874  0.625874
25    0.000000  0.000000    0.000000  0.000000
26    0.000000  0.000000    0.000000  0.000000
27    0.476250  0.476250    0.476250  0.476250
28    0.000000  0.000000    0.000000  0.000000
29    0.000000  0.000000    0.000000  0.000000
30    0.000000  0.000000    0.000000  0.000000
31    0.000000  0.000000    0.000000  0.000000
32    0.000000  0.000000    0.000000  0.000000
33    0.000000  0.000000    0.000000  0.000000
34    0.000000  0.823803    0.000000  0.000000
35    0.000000  0.000000    0.000000  0.000000
36    0.919318  0.919318    0.000000  0.000000
37    0.684217  0.684217    0.684217  0.684217
38    0.000000  0.000000    0.000000  0.000000
39    0.000000  0.000000    0.000000  0.000000
40    0.000000  0.000000    0.000000  0.000000
41    0.000000  0.000000    0.000000  0.000000
42    0.000000  0.000000    0.000000  0.000000
43    0.000000  0.000000    0.000000  0.000000
44    0.520122  0.520122    0.520122  0.520122
45    0.000000  0.978730    0.000000  0.000000
46    0.000000  0.000000    0.000000  0.000000
47    0.000000  0.000000    0.000000  0.000000
48    0.000000  0.000000    0.000000  0.000000
49    0.000000  0.000000    0.000000  0.000000
50    0.409681  0.409681    0.409681  0.409681
51    0.686131  0.000000    0.686131  0.686131
52    0.000000  0.000000    0.000000  0.000000
53    0.000000  0.000000    0.000000  0.000000
54    0.000000  0.000000    0.000000  0.000000
55    0.000000  0.000000    0.000000  0.000000
56    1.077314  1.077314    0.000000  0.000000
57    0.000000  0.000000    0.000000  0.000000
58    0.000000  0.000000    0.000000  0.000000
59    0.000000  0.000000    0.000000  0.000000
60    0.000000  0.978730    0.000000  0.000000
61    0.000000  0.000000    0.000000  0.000000
62    0.000000  0.823803    0.000000  0.000000
63    0.000000  0.000000    0.000000  0.000000
64    0.729528  0.729528    0.000000  0.000000
65    0.994297  0.000000    0.000000  0.000000
66    0.000000  0.978730    0.000000  0.000000
67    0.000000  0.000000    0.000000  0.000000
68    0.000000  0.978730    0.000000  0.000000
69    0.000000  0.000000    0.000000  0.000000
70    0.000000  0.000000    0.000000  0.000000
71    0.000000  0.000000    0.000000  0.000000
72    0.750746  0.750746    0.000000  0.000000
73    0.000000  0.000000    0.000000  0.000000
74    0.000000  0.000000    0.000000  0.000000
75    0.000000  0.000000    0.000000  0.000000
76    0.474487  0.474487    0.474487  0.474487
77    0.000000  0.000000    0.000000  0.000000
78    0.000000  0.000000    0.000000  0.000000
79    0.476250  0.476250    0.476250  0.476250
80    0.919318  0.919318    0.000000  0.000000
81    0.386630  0.386630    0.386630  0.386630
82    0.000000  0.000000    0.000000  0.000000
83    0.000000  0.000000    0.000000  0.000000
84    1.077314  1.077314    0.000000  0.000000
85    0.000000  0.000000    0.000000  0.000000
86    0.000000  0.000000    0.000000  0.000000
87    0.000000  0.000000    0.000000  0.000000
88    0.000000  0.000000    0.000000  0.000000
89    0.675950  0.675950    0.675950  0.675950
90    0.000000  0.000000    0.000000  0.000000
91    0.000000  0.000000    0.000000  0.000000
92    0.352581  0.352581    0.352581  0.352581
93    0.000000  0.000000    0.000000  0.000000
94    0.000000  0.000000    0.000000  0.000000
95    0.000000  0.000000    0.000000  0.000000
96    0.386630  0.386630    0.386630  0.386630
97    0.000000  0.000000    0.000000  0.000000
98    0.000000  0.000000    0.000000  0.000000
99    0.000000  0.000000    0.000000  0.000000
nnz_transformer = [ 4,  6,  7, 11, 12, 14, 15, 24, 27, 36, 37, 44, 50, 51, 56, 64, 65, 72, 76, 79, 80, 81, 84, 89, 92, 96]
nnz_pairwise    = [ 7, 10, 11, 12, 14, 15, 23, 24, 27, 34, 36, 37, 44, 45, 50, 56, 60, 62, ..., 68, 72, 76, 79, 80, 81, 84, 89, 92, 96]

@flying-sheep
Member

flying-sheep commented Jan 20, 2025

hmm, I think the new code just has better support for duplicate data rows actually …
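
How duplicates shrink the old distance matrix can be illustrated in isolation. The following is a minimal sketch, not scanpy's code: the toy kNN arrays are made up, and it only shows that a neighbor at distance 0 is stored as an explicit zero that `eliminate_zeros()` then drops from the sparse graph.

```python
import numpy as np
from scipy import sparse

# Toy kNN result for 3 observations with 2 neighbors each (hypothetical data).
# Observations 0 and 1 are duplicates, so they sit at distance 0 from each other.
knn_indices = np.array([[1, 2], [0, 2], [0, 1]])
knn_dists = np.array([[0.0, 1.5], [0.0, 1.5], [1.5, 1.5]])

n_obs, n_neighbors = knn_indices.shape
indptr = np.arange(0, n_obs * n_neighbors + 1, n_neighbors)
graph = sparse.csr_matrix(
    (knn_dists.ravel(), knn_indices.ravel(), indptr), shape=(n_obs, n_obs)
)

stored_before = graph.getnnz(axis=1)  # every row stores 2 neighbors
graph.eliminate_zeros()               # zero distances vanish from the graph
stored_after = graph.getnnz(axis=1)   # rows 0 and 1 now store only 1 neighbor

print(stored_before, stored_after)
```

After elimination, the duplicated observations have lost each other as neighbors, which matches the pattern of rows with fewer than 19 entries in the pairwise output.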

@ilan-gold
Contributor

@alam-shahul @keller-mark Could you both try out #3444 in your respective use cases and report back? I believe this should help. @alam-shahul your distances/connectivities match with good tolerance and the leiden results appear to be completely identical, although I don't have access to the "full" data.

@flying-sheep
Member

flying-sheep commented Jan 21, 2025

to be clear: please try running neighbors(..., transformer="sklearn-pairwise")

we won’t restore full backwards compatibility because of downsides, but we will add an option to get the old results.

@keller-mark

keller-mark commented Jan 21, 2025

Thanks @flying-sheep and @ilan-gold . I don't think this solves the issue with #2946 . The image comparison tests in #3446 are still failing in the python 3.10 + min-dependencies environment (but not a python 3.12 + normal environment).

I think this points to the issue in #2946 being with a change in how computations are performed between versions of one or more dependencies and therefore despite the same random seed, the end results diverge.

#3446 (comment)

I am not sure that we should expect random results, even with a seed set, to be identical in different environments.

@flying-sheep
Member

flying-sheep commented Jan 21, 2025

> I am not sure that we should expect random results, even with a seed set, to be identical in different environments.

Especially since numpy’s recommended random number API has no stability guarantee. So going forward, we can only have one if people pin a bunch of other dependencies, which means that I don’t feel like it will forever be worth the effort to have a stability guarantee in scanpy.

@keller-mark

Right, I think one could only guarantee stability given some version of scanpy and a fully pinned set of dependencies.
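
A minimal sketch of recording such a pinned set, using only the standard library. The package list is an assumption about which dependencies matter here and should be adjusted to the actual environment; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of relevant distributions; not an official scanpy list.
PACKAGES = [
    "scanpy", "anndata", "numpy", "scipy", "scikit-learn",
    "umap-learn", "pynndescent", "leidenalg", "igraph",
]
snapshot = installed_versions(PACKAGES)

# Lines in requirements.txt style for whatever is actually installed.
pins = [f"{name}=={version}" for name, version in snapshot.items() if version is not None]
print("\n".join(pins))
```

Storing this output next to the analysis results makes it possible to tell later whether two runs used the same environment.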

@flying-sheep
Member

I should have checked better. I wasted so much time on this.

We don’t support duplicated data. To not completely waste a week of debugging this, here are the final results:

On macOS ARM64, using pairwise_distances on duplicated data actually happens to create stable results.
So despite not being supported, it’s possible to recreate your results exactly if you’re on macOS ARM64.

But instead, you should deduplicate your data or otherwise deal with duplicates.

Here is the code. To use it, pass something like transformer=PairwiseDistancesTransformer(n_neighbors=20, ...) to neighbors (the dataclass fields have no defaults, so the remaining ones need values too).

from __future__ import annotations

from dataclasses import KW_ONLY, dataclass
from typing import TYPE_CHECKING

import numpy as np
from sklearn.base import TransformerMixin

from .._common import (
    _get_indices_distances_from_dense_matrix,
    _get_sparse_matrix_from_indices_distances,
)

if TYPE_CHECKING:
    from collections.abc import Mapping
    from typing import Literal, Self

    from numpy.typing import NDArray

    from ..._utils import _CSMatrix

    _Metric = Literal["cityblock", "cosine", "euclidean", "l1", "l2", "manhattan"]
    _MatrixLike = NDArray | _CSMatrix


@dataclass
class PairwiseDistancesTransformer(TransformerMixin):
    _: KW_ONLY
    algorithm: Literal["brute"]
    n_jobs: int
    n_neighbors: int
    metric: _Metric
    metric_params: Mapping[str, object]

    def fit(self, x: _MatrixLike) -> Self:
        self.x_ = x
        return self

    def transform(self, y: _MatrixLike | None) -> _CSMatrix:
        from sklearn.metrics import pairwise_distances

        d_arr = pairwise_distances(self.x_, y, metric=self.metric, **self.metric_params)
        ind, dist = _get_indices_distances_from_dense_matrix(
            d_arr, self.n_neighbors + 1
        )
        rv = _get_sparse_matrix_from_indices_distances(ind, dist, keep_self=True)
        return rv

@flying-sheep closed this as not planned Jan 24, 2025
@alam-shahul
Author

alam-shahul commented Jan 24, 2025

Thanks so much, sorry about taking up your time. This is useful, I'll use this to ensure reproducibility in my own code.

I didn't realize there was duplicated data here. Indeed, this is some simulated data, so I could see there being duplicates (although I'm a bit surprised, since it should be randomly generated). I also found that the clustering results changed when applied to real data, which I wouldn't have thought would have duplicates, so I'll have to check that.

Maybe it would be possible to add a documentation note that duplicated rows are not supported?

@flying-sheep
Member

Good point, but no idea where we could add a note about this. Very few libraries are capable of dealing with duplicates, so this was always implicitly true for basically everything we do, especially kNN.

After all, the following question has no answer: “given two identical rows a and b, and a different row c, is a or b closer to c?”
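
The tie can be made concrete with a small sketch (toy points chosen for illustration, not taken from the issue's data): both candidates are at exactly the same distance from c, so which one is reported as "the" nearest neighbor depends only on the order in which they are presented.

```python
import numpy as np

a = np.array([0.0, 0.0])
b = np.array([0.0, 0.0])  # exact duplicate of a
c = np.array([3.0, 4.0])

# Euclidean distances from c to a and to b: identical by construction.
dist_to_c = np.array([np.linalg.norm(c - a), np.linalg.norm(c - b)])

# A stable sort breaks the tie in favor of whichever candidate comes first.
nearest_in_order = int(np.argsort(dist_to_c, kind="stable")[0])
# Same candidates presented in reverse order; map the result back to the
# original indexing (0 = a, 1 = b).
nearest_reversed = 1 - int(np.argsort(dist_to_c[::-1], kind="stable")[0])

print(dist_to_c, nearest_in_order, nearest_reversed)
```

Any kNN implementation has to break this tie somehow, and different implementations (or the same one on different platforms) are free to break it differently.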

> I also found that the clustering results changed when applied to real data, which I wouldn't have thought would have duplicates, so I'll have to check that.

this is not the first time someone had duplicates in “real” data, it can happen for multiple reasons, such as measurement artifacts, all-0 rows, subsampling-with-replacement, …
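
A quick way to check a representation for exact duplicates, sketched with plain NumPy. The helper name `duplicate_row_mask` is hypothetical, and the check only catches bit-identical rows, not near-duplicates.

```python
import numpy as np


def duplicate_row_mask(x):
    """Return a boolean mask marking rows that occur more than once in x."""
    x = np.asarray(x)
    _, inverse, counts = np.unique(
        x, axis=0, return_inverse=True, return_counts=True
    )
    # ravel() keeps this robust to NumPy versions that return a 2-D inverse.
    return counts[inverse.ravel()] > 1


# Toy matrix: rows 0 and 2 are identical.
x = np.array([[1.0, 2.0], [3.0, 4.0], [1.0, 2.0], [5.0, 6.0]])
mask = duplicate_row_mask(x)

# Keep the first occurrence of every distinct row, in original order.
_, first_idx = np.unique(x, axis=0, return_index=True)
deduplicated = x[np.sort(first_idx)]

print(mask, deduplicated.shape)
```

For the example in this issue, the same check could be applied to `random_subset.obsm["normalized_X"]` before calling `sc.pp.neighbors`.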

@keller-mark mentioned this issue Jan 24, 2025