Difference between revisions of "Duplicate File Finder"

From Baranoski.ca
 
(One intermediate revision by the same user not shown)
Older revision:

Originally from [https://stackoverflow.com/questions/76242708/find-duplicate-files-with-hash-and-length-but-use-other-algorithm Stack Overflow] but I added a .bat wrapper

==duplicate.ps1==
<PRE>
$srcDir = 'C:\test'
$maxThreads = 6 # Tweak this value for more or fewer threads
$rs = [runspacefactory]::CreateRunspacePool(1, $maxThreads)
$rs.Open()

# Files can only be duplicates if they share a size, so group by length first
$tasks = Get-ChildItem -Path $srcDir -File -Recurse | Group-Object Length |
    Where-Object Count -GT 1 | ForEach-Object {
        # Hash each same-size group in its own runspace and keep repeated hashes
        $ps = [powershell]::Create().AddScript({
            $args[0] | Get-FileHash -Algorithm MD5 |
                Group-Object Hash |
                Where-Object Count -GT 1
        }).AddArgument($_.Group)

        $ps.RunspacePool = $rs

        @{ ps = $ps; iasync = $ps.BeginInvoke() }
    }

# Collect each result, then dispose of the PowerShell instance
$tasks | ForEach-Object {
    try {
        $_.ps.EndInvoke($_.iasync)
    }
    finally {
        if($_.ps) {
            $_.ps.Dispose()
        }
    }
}

if($rs) {
    $rs.Dispose()
}
</PRE>

==run.bat==
<PRE>
powershell -ExecutionPolicy Bypass -File duplicate.ps1 > out.txt
</PRE>

Newer revision:

Originally from [https://superuser.com/questions/1315365/how-can-i-generate-an-md5-sum-for-a-folder-on-windows Superuser.com]

==getsums.bat==
<PRE>
@echo off
rem Walk every file under the current folder, including subfolders
for /R . %%f in (*.*) do (
    rem Print the path and a separator without a trailing newline
    echo | set/p="%%f ~~~ "
    rem Drop the certutil header and status lines, which contain a colon, leaving the hash
    certutil -hashfile "%%f" MD5 | findstr /V ":"
)
</PRE>
<PRE style="color:white;background-color:black;font-weight:bold;font-size:1.2em;">
getsums.bat > output.txt
</PRE>

Latest revision as of 12:15, 22 April 2025

Originally from Superuser.com

getsums.bat

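@echo off
rem Walk every file under the current folder, including subfolders
for /R . %%f in (*.*) do (
    rem Print the path and a separator without a trailing newline
    echo | set/p="%%f ~~~ "
    rem Drop the certutil header and status lines, which contain a colon, leaving the hash
    certutil -hashfile "%%f" MD5 | findstr /V ":"
)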

Run it from the folder you want to scan and redirect the output to a file:

getsums.bat > output.txt
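
getsums.bat only lists each file with its MD5 sum; it does not pick out the duplicates. One possible way to do that last step is sketched below in Python (not part of the original script). It assumes each line of output.txt looks like `path ~~~ hash`, as the batch file prints it, and that the hash may or may not contain spaces depending on the certutil version. The names `find_duplicates` and `duplicates_in_file` are made up for this sketch.

```python
from collections import defaultdict

SEPARATOR = " ~~~ "  # the marker getsums.bat prints between path and hash


def find_duplicates(lines):
    """Group paths by hash and keep only hashes that occur more than once."""
    paths_by_hash = defaultdict(list)
    for line in lines:
        # Split on the last separator so the hash is always the final field
        path, sep, digest = line.rstrip("\r\n").rpartition(SEPARATOR)
        if not sep:
            continue  # blank or malformed line, no separator found
        # Assumption: some certutil versions print the hex bytes with spaces
        digest = digest.replace(" ", "").lower()
        if digest:
            paths_by_hash[digest].append(path)
    return {h: p for h, p in paths_by_hash.items() if len(p) > 1}


def duplicates_in_file(report_path="output.txt"):
    """Read the report written by getsums.bat and return its duplicate groups."""
    # Assumption: undecodable bytes are replaced rather than raising an error,
    # since the console code page of the redirected output is not known here
    with open(report_path, encoding="utf-8", errors="replace") as handle:
        return find_duplicates(handle)
```

Calling `duplicates_in_file()` in the folder that holds output.txt returns a dictionary that maps each repeated hash to the list of paths sharing it. Matching MD5 sums are a strong hint that files are identical, not proof, so compare the files before deleting anything.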