
Steady-state Non-Line-of-Sight Imaging

Wenzheng Chen 1,2*   Simon Daneau 1,3   Fahim Mannan 1   Felix Heide 1,4

1 Algolux   2 University of Toronto   3 Université de Montréal   4 Princeton University

Abstract

Conventional intensity cameras recover objects in the direct line-of-sight of the camera, whereas occluded scene parts are considered lost in this process. Non-line-of-sight imaging (NLOS) aims at recovering these occluded objects by analyzing their indirect reflections on visible scene surfaces. Existing NLOS methods temporally probe the indirect light transport to unmix light paths based on their travel time, which mandates specialized instrumentation that suffers from low photon efficiency, high cost, and mechanical scanning. We depart from temporal probing and demonstrate steady-state NLOS imaging using conventional intensity sensors and continuous illumination. Instead of assuming perfectly isotropic scattering, the proposed method exploits directionality in the hidden surface reflectance, resulting in (small) spatial variation of their indirect reflections for varying illumination. To tackle the shape-dependence of these variations, we propose a trainable architecture which learns to map diffuse indirect reflections to scene reflectance using only synthetic training data. Relying on consumer color image sensors, with high fill factor, high quantum efficiency and low read-out noise, we demonstrate high-fidelity color NLOS imaging for scene configurations tackled before with picosecond time resolution.

1. Introduction

Recovering objects from conventional monocular imagery represents a central challenge in computer vision, with a large body of work on sensing techniques using controlled illumination with spatial [50, 41] or temporal coding [32, 24, 19, 39], multi-view reconstruction methods [18], sensing via coded optics [47], and recently learned reconstruction methods using single-view monocular images [49, 11, 16]. While these sensing methods drive applications across domains, including autonomous vehicles, robotics, augmented reality, and dataset acquisition for scene understanding [52], they only recover objects in the direct line-of-sight of the camera.

*The majority of this work was done while interning at Algolux.

Figure 1: We demonstrate that it is possible to image occluded objects outside the direct line-of-sight using continuous illumination and conventional cameras, without temporal sampling. We sparsely scan a diffuse wall with a beam of white light and reconstruct "hidden" objects only from spatial variations in steady-state indirect reflections.

This is because objects outside the line-of-sight only contribute to a measurement through indirect reflections via visible diffuse object surfaces. These reflections are extremely weak due to the multiple scattering, and they lose (most) angular information on the diffuse scene surface (as opposed to a mirror surface in the scene). NLOS imaging aims at recovering objects outside a camera's line-of-sight from these indirect light transport components.

To tackle the lack of angular resolution, a number of NLOS approaches have been described that temporally probe the light transport in the scene, thereby unmixing light path contributions by their optical path length [1, 30, 36, 43] and effectively trading angular with temporal resolution. To acquire temporally resolved images of light transport, existing methods either directly sample the temporal impulse response of the scene by recording the temporal echoes of laser pulses [54, 43, 17, 7, 53, 3, 42], or they use amplitude-coded illumination and time-of-flight sensors [21, 26, 25]. While amplitude coding approaches suffer from low temporal resolution due to sensor demodulation bandwidth limitations [32] and the corresponding ill-posed inverse problem [19], direct probing methods achieve high temporal resolution already in the acquisition phase, but in turn require ultra-short pulsed laser illumination and detectors with <10 ps temporal resolution for macroscopic scenes. This mandates instrumentation with high temporal resolution, which suffers from severe practical limitations including low photon efficiency, large measurement volumes, high-resolution timing electronics, excessive cost and monochromatic acquisition. Early streak-camera setups [54] hence require hours of acquisition time, and, while emerging single photon avalanche diode (SPAD) detectors [7, 42] are sensitive to individual photons, they are in fact photon-inefficient (diffuse experiments in [42]) due to very low fill factors and pile-up distortions at higher pulse power. To overcome this issue without excessive integration times, recent approaches [42, 20] restrict the scene to retro-reflective material surfaces, which eliminates quadratic falloff from these surfaces, but effectively also constrains practical use to a single object class.

In this work, we demonstrate that it is possible to image objects outside of the direct line-of-sight using conventional intensity sensors and continuous illumination, without temporal coding. In contrast to previous methods, which assume perfectly isotropic reflectance, the proposed method exploits directionality of the hidden object's reflectance, resulting in spatial variation of the indirect reflections for varying illumination. To handle the shape-dependence of these variations, we learn a deep model trained using a training corpus of simulated indirect renderings. By relying on consumer color image sensors, with high fill factor, high quantum efficiency and low read-out noise, we demonstrate full-color NLOS imaging at fast imaging rates and in setup scenarios identical to those tackled by recent pulsed systems with picosecond resolution.

Specifically, we make the following contributions:

• We formulate an image formation model for steady-state NLOS imaging and an efficient implementation without ray-tracing. Based on this model, we derive an optimization method for the special case of planar scenes with known reflectance.

• We propose a learnable architecture for steady-state NLOS imaging for representative object classes.

• We validate the proposed method in simulation, and experimentally using setup and scene specifications identical to the ones used in previous time-resolved methods. We demonstrate that the method generalizes across objects with different reflectance and shapes.

• We introduce a synthetic training set for steady-state NLOS imaging. The dataset and models will be published for full reproducibility.

2. Related Work

Transient Imaging  Kirmani et al. [30] first proposed the concept of recovering "hidden" objects outside a camera's direct line-of-sight using temporally resolved light transport measurements in which short pulses of light are captured "in flight" before the global transport reaches a steady state. These transient measurements are the temporal impulse response of light transport in the scene. Abramson [1] first demonstrated a holographic capture system for transient imaging, and Velten et al. [55] showed the first experimental NLOS imaging results using a femtosecond laser and streak camera system. Since these seminal works, a growing body of work has been exploring transient imaging with a focus on enabling improved NLOS imaging [43, 36, 56, 17, 21, 19, 7, 38].

Impulse Non-Line-of-Sight Imaging  A major line of research [43, 54, 17, 42, 53, 3, 45, 40, 58] proposes to acquire transient images directly, by sending pulses of light into the scene and capturing the response with detectors capable of high temporal sampling. While the streak camera setup from Velten et al. [55] allows for temporal precision of <10 ps, corresponding to a path length of 3 mm, the high instrumentation cost and sensitivity has sparked work on single photon avalanche diodes (SPADs) as a detector alternative [7, 40]. Recently, O'Toole et al. [40] proposed a scanned SPAD capture setup that allows for computational efficiency by modeling transport as a shift-invariant convolution. Although SPAD detectors can offer comparable resolution of <10 ps [37], they typically suffer from low fill factors, usually around a few percent [44], and low spatial resolution in the kilo-pixel range [35]. Compared to ubiquitous intensity image sensors with >10 megapixel resolution, current SPAD sensors are still five orders of magnitude more costly, and two orders of magnitude less photon-efficient.

Modulated and Coherent Non-Line-of-Sight Imaging  As an alternative to impulse-based acquisition, correlation time-of-flight setups have been proposed [19, 25, 21, 26] which encode travel time indirectly in a sequence of phase measurements. While correlation time-of-flight cameras are readily available, e.g. Microsoft's Kinect One, their application to transient imaging is limited due to amplitude modulation bandwidths around 100 MHz, and hence temporal resolution in the nanosecond range. A further line of work [29, 28] explores using correlations in the carrier wave itself, instead of amplitude modulation. While this approach allows for single-shot NLOS captures, it is limited to scenes at microscopic scales [28].

Tracking and Classification  Most similar to the proposed method are recent approaches that use conventional intensity measurements for NLOS vision tasks [31, 8, 9, 5]. Although not requiring temporal resolution, these existing approaches are restricted to coarse localization and limited classification, in contrast to full imaging and geometry reconstruction applications.

3. Image Formation Model

Non-line-of-sight imaging methods recover object properties outside the direct line-of-sight from third-order bounces. Typically, a diffuse wall patch in the direct line-of-sight is illuminated, where the light then scatters and partially reaches a hidden object outside the direct line-of-sight. At the object surface, the scattered light is reflected back to the visible wall where it may be measured. In contrast to existing methods that rely on temporally resolved transport, the proposed method uses stationary third-bounce transport, i.e. without time information, to recover reflectance and geometry of the hidden scene objects.

3.1. Stationary Light Transport

Specializing the Rendering Equation [27] to non-line-of-sight imaging, we model the radiance L at a position w on the wall as

L(w) = \int_{\Omega} \rho(x - l,\, w - x)\, \big(n_x \cdot (x - l)\big)\, \frac{1}{r_{xw}^2}\, \frac{1}{r_{xl}^2}\, L(l)\, dx \;+\; \delta(\|l - w\|)\, L(l),    (1)

with x, n_x the position and corresponding normal on the object surface Ω, l being a given beam position on the wall, and ρ denoting the bi-directional reflectance distribution function (BRDF). This image formation model assumes three indirect bounces, with the distance function r modeling intensity falloff between input positions, and one direct bounce, when l and w are identical in the Dirac delta function δ(·), and it ignores occlusions in the scene outside the line-of-sight. We model the BRDF with a diffuse and specular term as

\rho(\omega_i, \omega_o) = \alpha_d\, \rho_d(\omega_i, \omega_o) + \alpha_s\, \rho_s(\omega_i, \omega_o).    (2)

The diffuse component ρ_d models light scattering, resulting in almost orientation-independent low-pass reflections without temporally coded illumination. In contrast, the specular reflectance component ρ_s contributes high-frequency specular highlights, i.e. mirror reflections blurred by a specular lobe. These two components are mixed with a diffuse albedo α_d and specular albedo α_s. While the spatial and color distributions of these two albedo components can vary, they are often correlated for objects composed of different materials, changing only at the boundaries of materials on the same surface. Although the proposed method is not restricted to a specific BRDF model, we adopt a Phong model [46] in the following.
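As a concrete illustration of Eqs. (1) and (2), the following sketch evaluates the steady-state third-bounce radiance at a wall point by summing over discretized hidden-surface samples under a Phong BRDF. It is a minimal NumPy sketch, not the paper's renderer; the function names, the normalized directions, the clamped cosine term, and the Phong parameters are illustrative assumptions.

```python
import numpy as np

def phong_brdf(wi, wo, n, alpha_d=0.7, alpha_s=0.3, shininess=20.0):
    # Diffuse + specular Phong mixture in the spirit of Eq. (2); wi, wo, n are unit vectors.
    r = 2.0 * np.dot(wi, n) * n - wi                 # mirror of wi about the normal
    diffuse = alpha_d / np.pi
    specular = alpha_s * max(np.dot(r, wo), 0.0) ** shininess
    return diffuse + specular

def steady_state_radiance(w, l, X, N, L_l=1.0):
    # Discretized third-bounce term of Eq. (1): radiance observed at wall point w
    # for a beam position l, summed over hidden-surface samples X with normals N.
    total = 0.0
    for x, n in zip(X, N):
        to_l, to_w = l - x, w - x
        r_xl, r_xw = np.linalg.norm(to_l), np.linalg.norm(to_w)
        wi, wo = to_l / r_xl, to_w / r_xw            # directions toward beam spot / wall point
        cos_in = max(np.dot(n, wi), 0.0)             # foreshortening at the object surface
        total += phong_brdf(wi, wo, n) * cos_in * L_l / (r_xw**2 * r_xl**2)
    return total
```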

3.2. Sensor Model

We use a conventional color camera in this work. We model the raw sensor readings with the Poisson-Gaussian noise model from Foi et al. [15, 14] as samples

b \sim \frac{1}{\kappa}\, \mathcal{P}\!\left( \frac{\kappa}{E} \int_{T} \int_{W} \int_{\Omega_A} L(w)\, d\omega\, dw\, dt \right) + \mathcal{N}(0, \sigma^2),    (3)

where we integrate Eq. (1) over the solid angle Ω_A of the camera's aperture, over the spatial position W that the given pixel maps to, and the exposure time T, resulting in the incident photons when divided by the photon energy E. The sensor measurement b at the given pixel is then modeled with the parameters κ > 0 and σ > 0 in a Poisson and Gaussian distribution, respectively, accurately reflecting the effects of analog gain, quantum efficiency and readout noise. For notational brevity, we have not included sub-sampling on the color filter array of the sensor.
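A minimal sketch of the Poisson-Gaussian measurement model in Eq. (3), assuming the triple integral has already been evaluated to an expected photon count per pixel; kappa and sigma correspond to the gain and read-noise parameters, and the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_sensor(photon_count, kappa=0.5, sigma=0.01):
    # photon_count: expected incident photons per pixel, i.e. the bracketed
    # integral in Eq. (3) divided by the photon energy E.
    shot = rng.poisson(kappa * photon_count) / kappa   # scaled Poisson shot noise
    read = rng.normal(0.0, sigma, size=np.shape(photon_count))
    return shot + read                                 # raw sensor reading b
```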

4. Inverse Indirect Transport for Planar Scenes

In this section, we address the special case of planar objects. Assuming planar scenes in the hidden volume allows us to recover reflectance and 3D geometry from indirect reflections. Moreover, in this case, we can formulate the corresponding inverse problem using efficient optimization methods with analytic gradients. In the remainder of this paper, we assume that the shape and reflectance of the directly visible scene parts are known, i.e. the visible wall area. The proposed hardware setup allows for high-frequency spatially coded illumination, and hence the wall geometry can be estimated using established structured-light methods [50]. Illuminating a patch l on the visible wall, a hidden planar scene surface produces a diffuse low-frequency reflection component, encoding the projected position independently of the orientation [31], and higher-frequency specular reflection components of the blurred specular albedo mapped to orientation-dependent positions on the wall. Assuming a single point light source at l on the wall, see Fig. 2, the specular direction at a plane point p is the mirror direction r = (p - l) - 2((p - l) · n) n, with the plane normal being n. The center of the specular lobe c on the wall is the mirror point of l, i.e. the intersection of the reflected ray in direction r with the wall. Conversely, if we detect a specular lobe around c in a measurement, we can solve for the corresponding plane point as

p(v, n) = c + \big((v - c) \cdot n\big) \left( n \;-\; \frac{(c - l) - \big((c - l) \cdot n\big)\, n}{n \cdot (2v - c - l)} \right),    (4)

which is a function of the planar surface represented by its normal n and a point v on the plane. Eq. (4) follows immediately from the constraint that the orthogonal projections of the points l and c onto the plane result in equal triangles.
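The geometry above can be checked numerically. The first helper below traces the mirror direction r from a plane point to the wall (the forward model described above), and the second inverts it following Eq. (4). This is a sketch with assumed parameter names (w0 and nw denote a point and unit normal of the wall plane); it is not the paper's optimization code.

```python
import numpy as np

def specular_lobe_center(p, l, n, w0, nw):
    # Forward model: reflect the ray l -> p at the hidden plane (unit normal n)
    # and intersect the mirror direction r with the wall plane (point w0, normal nw).
    r = (p - l) - 2.0 * np.dot(p - l, n) * n
    t = np.dot(w0 - p, nw) / np.dot(r, nw)
    return p + t * r

def plane_point_from_lobe(c, l, v, n):
    # Eq. (4): recover the plane point p from the detected lobe center c, the
    # beam position l, and the hidden plane given by point v and unit normal n.
    num = (c - l) - np.dot(c - l, n) * n             # in-plane component of (c - l)
    den = np.dot(n, 2.0 * v - c - l)
    return c + np.dot(v - c, n) * (n - num / den)
```

For a consistent configuration, plane_point_from_lobe(specular_lobe_center(p, l, n, w0, nw), l, v, n) recovers p up to numerical precision, since any point on the reflected ray is collinear with p and the mirror image of l.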


Figure 3: Experimental geometry and albedo reconstructions for the special case of planar objects, captured with the prototype from Sec. 7.2 and the setup geometry from [40]. We demonstrate reconstructions for three different surface materials. The first row shows an object with diamond-grade retroreflective surface coating as found on number plates and high-quality street signs, identical to the objects in [40], which surprisingly contains faint specular components visible in the measurements (please zoom into the electronic version of this document). The second and third rows show a conventionally painted road sign and an engineering-grade street sign. The proposed method runs in around two seconds including capture and reconstruction, and achieves high-resolution results without temporal sampling.

Projecting light beams to different positions on the wall results in different observations, which we dub indirect reflection maps, i.e. the indirect component of the image on the wall without the direct reflection. Each map contains information about the object shape and normals in a specific direction if the BRDF is angle-dependent. Note that this is not only the case for highly specular BRDFs, but also for Lambertian BRDFs due to foreshortening and varying albedo. Hence, by changing the beam position we acquire variational information about shape and reflectance.

Assuming locally smooth object surfaces, we sample the available wall area uniformly in a 5×5 grid and acquire multiple indirect reflection maps. We stack all captured images, forming an h×w×(5·5·3)-dimensional tensor as the network input (see the sketch below). The virtual source position is additional information that could be provided to the network; however, since we use uniform deterministic sampling, we found that the model learns this structured information, in contrast to random source sampling.
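As an illustration of how the network input could be assembled, the sketch below stacks the 25 RGB indirect reflection maps of the 5×5 wall scan along the channel axis; the fixed raster ordering stands in for the deterministic sampling mentioned above, and the function name is illustrative.

```python
import numpy as np

def build_network_input(reflection_maps):
    # reflection_maps: list of 25 indirect reflection maps of shape (h, w, 3),
    # one per beam position of the 5x5 scan, in a fixed raster order so the
    # network can implicitly learn the source positions.
    assert len(reflection_maps) == 25
    return np.concatenate(reflection_maps, axis=-1)    # shape (h, w, 75)
```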

We use the orthogonal view of the scene as our ground-truth latent variable, as if the camera had been placed in the center of the visible wall, facing along the wall normal, with ambient illumination present. Given the stack of indirect reflection maps, the proposed network is trained to estimate the corresponding orthogonal view into the hidden scene.

Network Architecture  We propose a variant of the U-Net architecture [48] as our network backbone, shown in Fig. 4. It contains an 8-layer encoder and decoder. Each encoder layer reduces the image size by a factor of two in each dimension and doubles the number of feature channels. This scaling is repeated until we retrieve a 1024-dimensional latent vector. We concatenate corresponding convolution and deconvolution layers of the same spatial size so that the decoder can learn residual information.
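A minimal sketch of such an 8-level encoder-decoder with skip connections, assuming PyTorch; the channel widths, normalization and activation choices are illustrative and not the released model. With a 256×256×75 input, the deepest feature is a 1×1×1024 latent vector, and each decoder level concatenates the matching encoder feature before upsampling.

```python
import torch
import torch.nn as nn

class EncBlock(nn.Module):
    # Halves the spatial resolution and changes the channel count.
    def __init__(self, c_in, c_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, c_out, 4, stride=2, padding=1),
            nn.BatchNorm2d(c_out), nn.LeakyReLU(0.2, inplace=True))
    def forward(self, x):
        return self.net(x)

class DecBlock(nn.Module):
    # Doubles the spatial resolution; the input is the previous decoder feature
    # concatenated with the skip connection from the matching encoder level.
    def __init__(self, c_in, c_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1),
            nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.net(x)

class SteadyStateUNet(nn.Module):
    # U-Net variant mapping a stack of indirect reflection maps (75 channels)
    # to an orthogonal RGB view of the hidden scene.
    def __init__(self, in_ch=75, out_ch=3):
        super().__init__()
        chs = [8, 16, 32, 64, 128, 256, 512, 1024]
        self.encs = nn.ModuleList(
            EncBlock(a, b) for a, b in zip([in_ch] + chs[:-1], chs))
        dec_in = [chs[-1]] + [2 * c for c in chs[-2::-1]]   # concat doubles channels
        dec_out = chs[-2::-1] + [out_ch]
        self.decs = nn.ModuleList(
            DecBlock(a, b) for a, b in zip(dec_in, dec_out))

    def forward(self, x):
        skips = []
        for enc in self.encs:          # 256 -> 128 -> ... -> 1 spatially
            x = enc(x)
            skips.append(x)
        x = self.decs[0](skips[-1])    # start from the 1x1x1024 latent
        for dec, skip in zip(self.decs[1:], reversed(skips[:-1])):
            x = dec(torch.cat([x, skip], dim=1))
        return x                       # (batch, 3, 256, 256)
```

For brevity the last decoder level reuses the uniform block structure; a production model would typically end with a plain output convolution without normalization.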

Loss Functions  We use a multi-scale ℓ2 loss function

V_{\text{multi-scale}} = \sum_{k} \gamma_k\, \| i_k - o_k \|_2,    (8)

where i is the predicted network output and o is the ground-truth orthogonal image. Here, k indexes the different scales and γ_k is the corresponding weight for that scale. For the feature map at the k-th layer, we add one extra deconvolution layer that converts the features into an estimate at the target resolution. We predict 64×64, 128×128 and 256×256 ground-truth images and set the weights γ_k to 0.6, 0.8 and 1.0, respectively. See the Supplemental Material for training details.
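A sketch of the multi-scale loss in Eq. (8), assuming PyTorch tensors in (batch, channel, height, width) layout; preds holds the network estimates at 64, 128 and 256 pixels and the ground-truth orthogonal view is resized to each scale. The mean-squared form and the bilinear resizing are implementation assumptions.

```python
import torch.nn.functional as F

def multi_scale_loss(preds, target, weights=(0.6, 0.8, 1.0)):
    # preds: list of predictions at increasing resolutions (64, 128, 256);
    # target: ground-truth orthogonal image at the highest resolution.
    loss = 0.0
    for pred, gamma in zip(preds, weights):
        gt = F.interpolate(target, size=pred.shape[-2:], mode='bilinear',
                           align_corners=False)
        loss = loss + gamma * F.mse_loss(pred, gt)     # weighted l2 term of Eq. (8)
    return loss
```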
