Ignore Duplicates and Create New List of Unique Values in Excel

I have a column of values that often appear as duplicates. I need to create a new column, of unique values based on the first column, as follows:

Column A   Column B
a          a
a          b
b          c
c
c

This Column B will actually need to appear on a different sheet, within the same workbook, so I assume it will need to work with the sheet2!A1 style format.

I have not had any luck with the Data/Filter menu options as this only seems to work on command. I need column B to update automatically whenever a new value is entered into column A.

509055 次浏览

There is a good guide of how to do this here.

Basically Something similar to:

=INDEX(Sheet1!$A$1:$A$20, MATCH(0, COUNTIF($B$1:B1,Sheet!$A$1:$A$20), 0))

On a sorted column, you can also try this idea:

B2=A2
B3=IFERROR(INDEX(A:A,MATCH(B2,A:A,1)+1),"")

B3 can be pasted down. It will result 0, after the last unique match. If this is unwanted, put some IF statement around to exclude this.

Edit:

Easier than an IF statement, at least for text-values:

B3=IFERROR(T(INDEX(A:A,MATCH(B2,A:A,1)+1)),"")

In the worksheet module for the sheet containing the list:

Private Sub Worksheet_Change(ByVal Target As Range)


Dim rngDest As Range


If Not Intersect(Target, Me.Columns(1)) Is Nothing Then


Set rngDest = ThisWorkbook.Sheets("Sheet2").Range("A1")


Me.Range(Me.Range("A2"), Me.Cells(Rows.Count, 1).End(xlUp)).AdvancedFilter _
Action:=xlFilterCopy, CopyToRange:=rngDest, Unique:=True


End If


End Sub

Totero's answer is correct. The link is also very helpful.

Basically the formula you need is:

B2=INDEX($A$2:$A$20, MATCH(0, COUNTIF($B$1:B1, $A$2:$A$20), 0))

Then press ctrl+shift+enter (or it will not work using a array formula).

Two important things to keep in mind here: The complete list is in cells A2:A20, then this formula has to be pasted in cell B2 (Not B1 as that will give you circular reference). Secondly this is an array formula, so you need to press ctrl+shift+enter or it will not work correctly.

To remove duplicates from a column

  1. Sort the values in column A A->Z
  2. Select column B
  3. While column B is still selected, in the formula input box, enter

    =IF(TRIM(A1)=TRIM(A2),"",TRIM(A1))
    
  4. While Column B is still selected, select Edit -> Fill -> Down (in newer versions, simply select cell B1 and pull down the outer box to expand all the way down in the column)

Note: if column B is on another sheet, you may do Sheet1!A1 and Sheet1!A2.

I have a list of color names in range A2:A8, in column B I want to extract a distinct list of color names.

Follow the below given steps:

  • Select the Cell B2; write the formula to retrieve the unique values from a list.
  • =IF(COUNTIF(A$2:A2,A2)=1,A2,””)
  • Press Enter on your keyboard.
  • The function will return the name of the first color.
  • To return the value for the rest of cells, copy the same formula down. To copy formula in range B3:B8, copy the formula in cell B2 by pressing the key CTRL+C on your keyboard and paste in the range B3:B8 by pressing the key CTRL+V.
  • Here you can see the output where we have the unique list of color names.

In my case the excel was frozen when using the formula of

B2=INDEX($A$2:$A$20, MATCH(0, COUNTIF($B$1:B1, $A$2:$A$20), 0))

because there was many rows (10000). So I did in another way which I show below.

I have copied my original list to a second column and then with the function of Excel "remove duplicates" I could find the list of unique values.

Copied from Microsoft Office Website:

Select all the rows, including the column headers, in the list

you want to filter.

Click the top left cell of the range, and then drag to the bottom right cell.

On the Data menu, point to Filter, and then click Advanced Filter.
In the Advanced Filter dialog box, click Filter the list, in place.
Select the Unique records only check box, and then click OK.

The filtered list is displayed and the duplicate rows are hidden.

On the Edit menu, click Office Clipboard.

The Clipboard task pane is displayed.

Make sure the filtered list is still selected, and then click Copy Copy button.

The filtered list is highlighted with bounding outlines and the selection appears as an > > item at the top of the Clipboard.

On the Data menu, point to Filter, and then click Show All.

The original list is re-displayed.

Press the DELETE key.

The original list is deleted.

In the Clipboard, click on the filtered list item.

The filtered list appears in the same location as the original list.

Source: Microsoft Office Website (link removed, cause dead)

The MODERN approach is to consider cases where column of information come from a web service such as an OData source. If you need to generate a filter select fields off of massive data that has replicated values for the column, consider the code below:

var CatalogURL = getweb(currenturl)
+"/_api/web/lists/getbytitle('Site%20Inventory%20and%20Assets')/items?$select=Expense_x0020_Type&$orderby=Expense_x0020_Type";


/* the column that is replicated, is ordered by <column_name> */


OData.read(CatalogURL,
function(data,request){


var myhtml ="";
var myValue ="";


for(var i = 0; i < data.results.length; i++)
{
myValue = data.results[i].Expense_x0020_Type;


if(i == 0)
{
myhtml += "<option value='"+myValue+"'>"+myValue+"</option>";
}
else
if(myValue != data.results[i-1].Expense_x0020_Type)
{
myhtml += "<option value='"+myValue+"'>"+myValue+"</option>";


}
else
{


}




}


$("#mySelect1").append(myhtml);


});

So for this task First Sort your data in order from A to Z or Z to A then you can just use one simple formula as stated below:

=IF(A2=A3, "Duplicate", "Not Duplicate")

The above formula states that if column A2 data ( A is column and 2 is row number) is similar to A3 (A is Column and 3 is Row number) then it will print Duplicate else will print Not Duplicate.

Lets consider an example, Column A consists Email address in which some are duplicate, so in Column 2, I used the above stated formula which in results displayed me the 2 duplicates cells one is Row 2 and Row 6.

One you got the duplicate data just put filter on your sheet and make visible only the duplicate data and delete all the unnecessary data.

Honestly I followed these examples to a tee and they simply didn't work. What I ended up doing after struggling pointlessly trying to get Excel to work was to just copy the entire contents of my column to NotePad++ where I was able to find an easy solution within minutes. Take a look at this: Removing duplicate rows in Notepad++

Edit: Here is a brief overview of how to do it in TextFX:

Plugins -> Plugin Manager -> Show Plugin Manager -> Available tab -> TextFX -> Install

After TextFX is installed in NotePad++, then you select all your text you want to remove duplicates from, then make sure to check: TextFX -> TextFX Tools -> Sort outputs only UNIQUE lines

Then click "sort lines case sensitive" or "sort lines case insensitive" and it will perform the unique sort.

Find here mentioned above formula with error control

=IFERROR(INDEX($B$2:$B$9, MATCH(0,COUNTIF($D$1:D1, $B$2:$B$9), 0)),"")

where: (B2:B9 is the column data which you want to extract the unique values, D1 is the above cell where your formula is located)

All you have to do is : Go to Data tab Chose advanced in Sort & Filter In actions select : copy to another location if want a new list - Copy to any location In list range chose the list you want to get the records off . And the most important thing is to check : Unique records only .

=SORT(UNIQUE(A:A))

The above formula works best if you want to list unique values in a column.

Another approach, since Excel 2016, is to use Power Query.

Howto:

  • select the data (including the field name),
  • use menu Data > From a table or a range,
  • (Excel will change your sheet into an Excel Table, which is very convenient),
  • in the Power Query Editor, right-click on ColumnA (the column header), and Remove duplicates,
  • in the menu, choose Close and load, choose where you want the result, and you're done, like this.
  • Whenever you want the result table to update, right-click it and choose Refresh.

Benefits :

  • it uses the CPU only when manually updated, which is very convenient for long lists,
  • if you're curious, this offers many other powerful options.

Drawbacks :

  • it doesn't update on the fly (you have to right-click and refresh the result table),
  • people with old version of Excel won't be able to refresh the results table.