java.lang.Object
  org.apache.mahout.cf.taste.impl.similarity.AbstractItemSimilarity
    org.apache.mahout.cf.taste.impl.similarity.EuclideanDistanceSimilarity
public final class EuclideanDistanceSimilarity
An implementation of a "similarity" based on the Euclidean "distance" between two users X and Y. Thinking of items as dimensions and preferences as points along those dimensions, a distance is computed using all items (dimensions) where both users have expressed a preference for that item. This is simply the square root of the sum of the squares of differences in position (preference) along each dimension.
The similarity could be computed as 1 / (1 + distance), so the resulting values are in the range (0,1]. However, this would penalize pairs that overlap in more dimensions, even though more overlap should indicate more similarity, because more dimensions offer more opportunities to be farther apart. To help correct for this, the similarity is actually computed as sqrt(n) / (1 + distance), where n is the number of dimensions. sqrt(n) is chosen because the distance between randomly chosen points grows as sqrt(n).
Note that this could cause a similarity to exceed 1; such values are capped at 1.
Note that the distance isn't normalized in any way; it's not valid to compare similarities computed from different domains (different rating scales, for example). Within one domain, normalizing doesn't matter much as it doesn't change ordering.
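The formula described above can be illustrated with a minimal, self-contained sketch. It is not the Mahout implementation: the class name `EuclideanSimilaritySketch`, the use of plain maps from item ID to preference value, and the choice to return Double.NaN when two users share no items are all assumptions made for illustration.

```java
import java.util.Map;

public class EuclideanSimilaritySketch {

    /**
     * Hypothetical helper: similarity of two users given as maps of
     * item ID -> preference value. Only items rated by both users count.
     */
    public static double similarity(Map<Long, Double> prefsX, Map<Long, Double> prefsY) {
        double sumOfSquares = 0.0;
        int n = 0; // number of shared items (dimensions)
        for (Map.Entry<Long, Double> entry : prefsX.entrySet()) {
            Double other = prefsY.get(entry.getKey());
            if (other != null) {
                double diff = entry.getValue() - other;
                sumOfSquares += diff * diff;
                n++;
            }
        }
        if (n == 0) {
            // Assumption: no overlap means the similarity is unknown.
            return Double.NaN;
        }
        double distance = Math.sqrt(sumOfSquares);
        // sqrt(n) / (1 + distance), capped at 1 as described above.
        return Math.min(Math.sqrt(n) / (1.0 + distance), 1.0);
    }

    public static void main(String[] args) {
        Map<Long, Double> x = Map.of(1L, 5.0, 2L, 3.0, 3L, 4.0);
        Map<Long, Double> y = Map.of(1L, 2.0, 2L, 3.0, 4L, 1.0);
        System.out.println(similarity(x, y));
    }
}
```

With these sample values the two users share items 1 and 2, so n = 2 and the distance is 3, which gives sqrt(2) / 4. Two identical preference maps have distance 0, so the raw value sqrt(n) is capped at 1.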
| Constructor Summary |
|---|
| EuclideanDistanceSimilarity(DataModel dataModel) |
| EuclideanDistanceSimilarity(DataModel dataModel, Weighting weighting) |

| Method Summary | | |
|---|---|---|
| double[] | itemSimilarities(long itemID1, long[] itemID2s) | A bulk-get version of ItemSimilarity.itemSimilarity(long, long). |
| double | itemSimilarity(long itemID1, long itemID2) | Returns the degree of similarity of two items, based on the preferences that users have expressed for the items. |
| void | refresh(Collection<Refreshable> alreadyRefreshed) | Triggers "refresh" -- whatever that means -- of the implementation. |
| void | setPreferenceInferrer(PreferenceInferrer inferrer) | Attaches a PreferenceInferrer to the UserSimilarity implementation. |
| String | toString() | |
| double | userSimilarity(long userID1, long userID2) | Returns the degree of similarity of two users, based on their preferences. |

| Methods inherited from class org.apache.mahout.cf.taste.impl.similarity.AbstractItemSimilarity |
|---|
| allSimilarItemIDs, getDataModel |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |

| Constructor Detail |
|---|

public EuclideanDistanceSimilarity(DataModel dataModel)
throws TasteException

Throws:
IllegalArgumentException - if DataModel does not have preference values
TasteException

public EuclideanDistanceSimilarity(DataModel dataModel,
Weighting weighting)
throws TasteException

Throws:
IllegalArgumentException - if DataModel does not have preference values
TasteException

| Method Detail |
|---|

public final void setPreferenceInferrer(PreferenceInferrer inferrer)

Description copied from interface: UserSimilarity
Attaches a PreferenceInferrer to the UserSimilarity implementation.

Specified by:
setPreferenceInferrer in interface UserSimilarity
Parameters:
inferrer - PreferenceInferrer

public double userSimilarity(long userID1,
long userID2)
throws TasteException

Description copied from interface: UserSimilarity
Returns the degree of similarity of two users, based on their preferences.

Specified by:
userSimilarity in interface UserSimilarity
Parameters:
userID1 - first user ID
userID2 - second user ID
Returns:
Double.NaN if the similarity is unknown
Throws:
NoSuchUserException - if either user is known to be non-existent in the data
TasteException - if an error occurs while accessing the data

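Because an unknown similarity is reported as Double.NaN, callers that rank results need to test for it explicitly: in Java, every ordering comparison involving NaN evaluates to false. The following is a minimal sketch of that check. The class `NaNSimilaritySketch` and the method `mostSimilar` are hypothetical names, and the input array simply stands in for similarity values obtained elsewhere.

```java
public class NaNSimilaritySketch {

    /**
     * Hypothetical helper: returns the index of the largest known similarity,
     * or -1 if every entry is Double.NaN (or the array is empty).
     */
    public static int mostSimilar(double[] similarities) {
        int best = -1;
        for (int i = 0; i < similarities.length; i++) {
            double s = similarities[i];
            if (Double.isNaN(s)) {
                continue; // unknown similarity: skip rather than compare
            }
            if (best < 0 || s > similarities[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] similarities = {Double.NaN, 0.4, 0.9, Double.NaN};
        System.out.println(mostSimilar(similarities));
    }
}
```

Skipping NaN entries up front keeps them from being treated as either the best or the worst candidate.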
public final double itemSimilarity(long itemID1,
long itemID2)
throws TasteException

Description copied from interface: ItemSimilarity
Returns the degree of similarity of two items, based on the preferences that users have expressed for the items.

Specified by:
itemSimilarity in interface ItemSimilarity
Parameters:
itemID1 - first item ID
itemID2 - second item ID
Returns:
Double.NaN if the similarity is unknown
Throws:
NoSuchItemException - if either item is known to be non-existent in the data
TasteException - if an error occurs while accessing the data

public double[] itemSimilarities(long itemID1,
long[] itemID2s)
throws TasteException

Description copied from interface: ItemSimilarity
A bulk-get version of ItemSimilarity.itemSimilarity(long, long).

Specified by:
itemSimilarities in interface ItemSimilarity
Parameters:
itemID1 - first item ID
itemID2s - second item IDs to compute similarity with
Throws:
NoSuchItemException - if any item is known to be non-existent in the data
TasteException - if an error occurs while accessing the data

public final void refresh(Collection<Refreshable> alreadyRefreshed)

Description copied from interface: Refreshable
Triggers "refresh" -- whatever that means -- of the implementation. The general contract is that any Refreshable should always leave itself in a consistent, operational state, and that the refresh atomically updates internal state from old to new.

Specified by:
refresh in interface Refreshable
Overrides:
refresh in class AbstractItemSimilarity
Parameters:
alreadyRefreshed - Refreshables that are known to have already been refreshed as a result of an initial call to a Refreshable.refresh(Collection) method on some object. This ensures that objects in a refresh dependency graph aren't refreshed twice needlessly.

public final String toString()

Overrides:
toString in class Object
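
The role of the alreadyRefreshed parameter described for refresh above can be sketched with a small stand-alone example. Everything here is hypothetical: the nested `Refreshable` interface is a local stand-in that only mirrors the documented method signature, and `Component`, `dependsOn`, and the null-handling are illustrative choices, not Mahout code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class RefreshSketch {

    /** Local stand-in with the same method shape as the documented Refreshable. */
    public interface Refreshable {
        void refresh(Collection<Refreshable> alreadyRefreshed);
    }

    /** Hypothetical component that counts how often it has been refreshed. */
    public static class Component implements Refreshable {
        public int refreshCount = 0;
        private final List<Refreshable> dependencies = new ArrayList<>();

        public Component dependsOn(Refreshable dependency) {
            dependencies.add(dependency);
            return this;
        }

        @Override
        public void refresh(Collection<Refreshable> alreadyRefreshed) {
            // Assumption: a null argument marks the initial call in the graph.
            Collection<Refreshable> seen =
                alreadyRefreshed == null ? new HashSet<Refreshable>() : alreadyRefreshed;
            // Refresh dependencies first, skipping any already handled.
            for (Refreshable dependency : dependencies) {
                if (!seen.contains(dependency)) {
                    dependency.refresh(seen);
                }
            }
            refreshCount++;
            seen.add(this); // record this object so others do not refresh it again
        }
    }

    public static void main(String[] args) {
        Component model = new Component();
        Component a = new Component().dependsOn(model);
        Component b = new Component().dependsOn(model);
        Component top = new Component().dependsOn(a).dependsOn(b);
        top.refresh(null);
        System.out.println(model.refreshCount);
    }
}
```

In this diamond-shaped graph both `a` and `b` depend on `model`, yet the shared collection ensures `model` is refreshed only once per top-level call.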