Sweet Family DNA Project

Introduction

A whole new field of genealogical research is now opening up, thanks to modern science. It relies on the simple biological fact that the Y chromosome, unlike all others, is passed intact and virtually unchanged from father to son down through the generations. It is therefore possible to do a simple DNA analysis on two men and learn immediately (well, actually the analysis takes a few weeks) whether the two are related on the paternal side or not. In particular, by testing direct male-line descendants of the early Sweet immigrants to the New World, we can learn whether the immigrants were related or not. Obviously, it is necessary to test several such descendants for each immigrant in order to be really sure of the results, but that's all it takes. The testing so far indicates that the various Sweet immigrants were from different families, and so it is now possible to do two things that Sweet genealogists previously only dreamed of: (1) any living male Sweet who is having trouble connecting back to the era of the immigrants can take the DNA test and find out which group is his, and (2) testing Sweets in England or elsewhere can provide direct proof of which Sweet immigrants are related to which Sweets in the old country. Needless to say, it may take some time to build a catalog of all the distinct Sweet families, but every new test brings us closer to the goal.

How It Works

For an introduction to the field of DNA-assisted genealogy, view Thomas Roderick's write-up. In case you want just the shortest possible description, here it is: the test measures the lengths of a small number of specific sequences (normally called loci or markers) on the Y chromosome. These sequences, like most of the Y chromosome, don't have any known genetic function, but comparing the lengths from different test subjects can reveal how closely they are related.

Note: the test is not designed to reveal any physical characteristics or innate tendencies. The reason it works for our (genealogical) purposes is that the observed changes in sequence length are neither harmful nor helpful; they simply happen now and then, and they persist because the body doesn't notice the difference. These persistent-yet-changeable lengths allow us to tell families apart.

Since this test applies to the Y chromosome, the test subjects have to be male and, in particular, have to have the surname Sweet or similar (with a few exceptions due to adoptions, name changes, and such). If you are interested in helping the study, but are not a potential testee yourself, here's what you can do. Basically, there is a list of prospects, and you just have to work your way down the list until you find one that works...

  1. Yourself - if you are female, that's out; if you are male, but not a Sweet, that's out, and you can also skip #2 and #3; if you are male and married to a Sweet, then just take your wife's point of view for the rest of the list...
  2. Brothers - can often be persuaded to participate for your sake...
  3. Father - also very persuadable...
  4. Uncles or 1st cousins - you just have to ask nicely and/or appeal to their interest (if any) in family history...
  5. 2nd cousins or 1st cousins once removed...
  6. and so on...

Lots of researchers focus on their own ancestors, so that the "and so on" may require research you haven't done yet, but it's still something that should be within reach if you start working on it.

Note: even if you can't find a male-line cousin to take the test, you can help the project in another way -- by making a contribution to our General Fund that is used to help pay for the testing of those who are not so well-off. (See below.)

The goal in all this is to come up with (collectively) at least two male-line descendants of each identifiable Sweet "founder," preferably via at least two different sons of the founder. Assuming the DNA test results agree for the documented descendants of the progenitor, we can "reconstruct" the haplotype (DNA pattern) for that progenitor and then compare against the haplotypes of other progenitors to see if they were related. It's really that simple. Consider, for example, the two John Sweets who came to Massachusetts in the early days and both started out in what became Essex County. There is evidence that the one who settled in Newbury spelled his name both Sweet and Swett, but always pronounced it Swett, and it seems likely that many modern Swetts in the US are descended from him. The other one settled in Salem but soon relocated to Rhode Island (or, at least, his family did -- the founder's death is not recorded, and so it's not clear whether he actually moved or not). At any rate, many people have assumed these two were cousins, but nobody has any proof. A DNA study can settle, once and for all, whether the two were indeed related, thereby moving the whole question from the realm of speculation to the realm of fact. (Read on for the answer to this old conundrum.)

Another example is the line of Sweets in Attleborough, Massachusetts. The earliest confirmed ancestor of this line is Henry Sweet, who first appeared in the public record at the time of his marriage in 1687. Although his descendants lived near the Rhode Island Sweets, there is no known connection other than geographic. A DNA study can demonstrate that a family connection does or does not exist. Similarly, there are many other early Sweets in or near Rhode Island who are assumed to belong to the Rhode Island branch, often on the basis of incomplete or circumstantial evidence. Of course, a finding that all such groups share the same Y chromosome would not, by itself, prove that all are descended from the immigrant John Sweet.

Other variants of the name include SWITT, SWEATT, and SWEAT. It remains to be seen how and whether these are related.

We have arranged with FamilyTree DNA (FTDNA) to offer a reduced, group rate of $99 plus shipping per 12-locus DNA test to members of our project or $124 plus shipping per 25-locus test. (There is also a group rate for the 37- and 67-locus tests, but there is no evidence yet that we need those tests, except in rare cases.) The test kit is very simple and comes in the mail with complete instructions: basically, it contains three swabs to be rubbed on the inside of the mouth to collect loose cells. The swabs are then popped into a preservative and mailed back to the lab. The kit comes with an optional release form that requests FTDNA to give your email address to any present or future project participant who matches you exactly on the DNA test. If you decide not to sign the release form, or simply forget to return it, your privacy will be absolutely protected, and FTDNA will not notify you or anyone else about matches with your DNA. There is also a space where you can write down the country of origin of your ultimate ancestor -- this is optional and has no bearing on the present study.

For more information contact our project coordinator: John Chandler.

 
 

Sweet DNA Fund

FTDNA has established a fund to be used within our project to help defray the costs of DNA testing. The company matched the first $200 of contributions to this fund, but we are on our own now. The intent of this fund is to secure the participation of potential testees who seem likely to contribute to the success of the project as a whole and who otherwise could not (or would not) join. If you would like to make a donation, please visit this site:

http://www.familytreedna.com/group-general-fund-contribution.aspx

Note that donations can be made either on-line (by credit card or PayPal) or by mail. It is important to specify the Sweet project on the form, so that the donation is properly credited. In the on-line form, this entails choosing the initial letter "S" and then clicking on Sweet in the menu. You may also specify how the donation is to be used and/or that it is a memorial, by selecting a "donation type" and/or entering a note about the donation. For example, you might select the "Memory of" type and enter a note saying "in memory of John Smith - For testing Sweets in Canada." If your restriction is more complicated, it would be helpful to send an email to the project coordinator, spelling out the details. If you wish to contribute by mail, fill out the form on-line and print it for sending in.

 

 
 

Project News

 
 

Results and Discussion

In Table 1, 4-, 5-, or 6-digit ID's refer to FTDNA results; a prefix of "N" before the ID means the testing was done originally as part of the Genographic project (see below); a row beginning with a name is the inferred haplotype of the family patriarch. ID's with a mixture of lower-case letters and digits refer to test results from outside sources. Results discovered in the SMGF database are designated as "sm" followed by a number, for example, "sm10". Other results, located in the Ysearch database, are designated by the Ysearch ID. Results from other labs, contributed by the participants, are uploaded to Ysearch to make them publicly available outside of the circle of this project.

At present, we are still "exploring the territory." Until recently, all of the samples tested belonged to the same general group (known as HAPLOGROUP R1b, the most common such group in western Europe). Even now, there are only a few exceptions, and most of these are from haplogroup I1, the second most common in that area. However, we also have one from haplogroup E3a, which is more common in Africa.

Within the project, we can distinguish five subgroups or patterns confirmed by multiple samples and corresponding to known 17th-century ancestry, but there are also a considerable number of samples with no close matches. For the time being, the unmatched "R1b" samples are displayed along with Pattern 1, since that is the largest group, and the non-matching, non-R1b samples are placed as "Other." Within each of the first two subgroups, there is an evident "majority" pattern, and markers that differ from the majority are colored gray in the table. Note that DYS389ii is tabulated as reported by the testing lab, but that length actually includes two pieces, one of which is already reported as DYS389i. We therefore use the differences between "ii" and "i" for the purpose of comparison in the column marked "389ii". Another complexity comes into play with DYS464, which appears four times in the genome. Since we cannot tell the four instances apart, they are conventionally reported in order of increasing size. In principle, when comparing two haplotypes, one could check off any matching instances within DYS464 as true matches, even if they appear in different columns, but that practice leads to a trade-off between the number of matching markers and the sizes of the mismatches. Note that this same ambiguity also appears with DYS385, DYS459, and other multi-copy markers, but to a lesser extent.
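The two adjustments just described can be sketched in a few lines of Python. This is only an illustration: the function names and the marker values are hypothetical, not taken from Table 1, and the sketch assumes each multi-copy marker is stored as a simple list of repeat counts.

```python
from collections import Counter

def normalize_389(haplotype):
    """Return a copy in which DYS389ii holds only its second piece (ii minus i)."""
    result = dict(haplotype)
    result["DYS389ii"] = haplotype["DYS389ii"] - haplotype["DYS389i"]
    return result

def column_matches(copies_a, copies_b):
    """Count matches when copies are compared column by column (sorted order)."""
    return sum(x == y for x, y in zip(sorted(copies_a), sorted(copies_b)))

def multicopy_matches(copies_a, copies_b):
    """Count copies that can be paired off as equal, regardless of column."""
    overlap = Counter(copies_a) & Counter(copies_b)
    return sum(overlap.values())

# Hypothetical values, for illustration only.
sample = {"DYS389i": 13, "DYS389ii": 29}
print(normalize_389(sample)["DYS389ii"])    # 16

dys464_a = [14, 15, 16, 17]
dys464_b = [15, 16, 17, 18]
print(column_matches(dys464_a, dys464_b))     # 0
print(multicopy_matches(dys464_a, dys464_b))  # 3
```

The last two lines show the trade-off mentioned above. Compared column by column, the two made-up DYS464 results differ in all four columns, by one step each. Paired off freely, three copies match, but the leftover pair (14 against 18) differs by four steps.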

When comparing haplotypes, there are two very different possible contexts. In the general context, when the two persons are not known to be related, and they may have nothing in common besides their surname, the only relevant information may simply be the two haplotypes. The degree of the relationship, if any, is the count of generations from person "A" back to the most recent common ancestor and from there down to person "B". When the two persons are in the same generation, their degree of relationship is just twice the time (measured in generations) from their common ancestor. This time is generally abbreviated as TMRCA. How do we get from the haplotypes to the relationship or TMRCA? It is important to remember that we can never deduce the relationship exactly from just the DNA testing. However, there is a fairly simple procedure for getting an approximate answer. By counting up all of the differences, we get what is called the "genetic distance" and we can estimate the TMRCA, given the number of markers compared and the average rate of mutation for those markers. In a separate figure, we display a graph of estimated TMRCA (times two) against genetic distance.
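As a rough illustration of the counting step, here is a minimal Python sketch. It assumes each haplotype is stored as a mapping from marker name to repeat count, and that a two-step difference on one marker adds two to the total. The function name and the marker values are made up for the example and do not come from Table 1.

```python
def genetic_distance(hap_a, hap_b):
    """Return (distance, markers_compared) for two haplotypes.

    The distance is the sum of absolute step differences over the
    markers that both samples have tested.
    """
    shared = sorted(hap_a.keys() & hap_b.keys())
    distance = sum(abs(hap_a[m] - hap_b[m]) for m in shared)
    return distance, len(shared)

# Hypothetical haplotypes, for illustration only.
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
person_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 9}

print(genetic_distance(person_a, person_b))  # (3, 4)
```

Only markers present in both results are compared, so a 12-marker result can be set against a 25-marker result without any special handling.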

Note that the genetic distance as described here is an over-simplification, since two discrepancies of one step each do not really count the same as one discrepancy of two steps. For example, when the genetic distance is zero, the most likely TMRCA is also zero. (To put it another way, when two samples match exactly, we would suspect they were taken from the same person if we didn't know better.) When the genetic distance is one step out of 25 markers with an average mutation rate of 0.0023 per generation, the most likely relationship is 17 generations counted up and back. Of course, this is only an estimate! There is an additional conversion from generations to years which depends on the average generation length. We use a round number of 30 years per generation in this discussion.
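The arithmetic behind the one-step example can be sketched as follows. The model is an assumption made here for illustration: every marker has the same chance of mutating in each generation along both lines of descent, so the expected number of differences is the number of markers times the mutation rate times the number of generations counted up and back. Setting the observed distance equal to that expectation gives the estimate. The function names are hypothetical.

```python
def degree_of_relationship(distance, n_markers, rate):
    """Estimated generations counted up to the common ancestor and back down."""
    return distance / (n_markers * rate)

def tmrca_generations(distance, n_markers, rate):
    """Estimated generations back to the most recent common ancestor."""
    return degree_of_relationship(distance, n_markers, rate) / 2

def tmrca_years(distance, n_markers, rate, years_per_generation=30):
    """Convert the TMRCA estimate from generations to years."""
    return tmrca_generations(distance, n_markers, rate) * years_per_generation

# The example from the text: one step out of 25 markers, rate 0.0023.
print(round(degree_of_relationship(1, 25, 0.0023)))  # 17
print(round(tmrca_years(1, 25, 0.0023)))             # 261
print(degree_of_relationship(0, 25, 0.0023))         # 0.0
```

This sketch reproduces the 17-generation figure quoted above, but it treats every step alike and gives no uncertainty, so it may not reproduce the century estimates quoted elsewhere on this page.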

The other context for comparing haplotypes occurs when there is a known relationship between the two persons being compared. Then, the question under consideration is whether the DNA results are consistent with the conventional genealogy. This question is not as simple as it may seem, since the expected answer is "yes" or "no," but the real answer is a statistical one. The relatively simple plot mentioned in the previous paragraph is unsuited to this sort of question because it doesn't qualify the estimates with their reliability. Another figure shows the uncertainties of these estimates. Each point is plotted as an "X" with a line extending up and down from the point by an amount equal to the statistical uncertainty in the estimate. Note that estimates can be off by more than this amount!

12-Marker Results

The haplotypes now in hand do include four 12-marker exact matches. The largest group of these matches forms the majority of Pattern 1, as shown in Table 1, and includes John SWEET of Salem. Nearly all of these now have 25-marker results, and a clear majority match 25/25 as well. We discuss the 25-marker results in the next section.

Four other results are only one step away from this bunch of 12-marker matches. Three of these, 13407, 23714, and 60265, all have the same one-step mutation and therefore constitute a second group of exact matches within Pattern 1. Although that result could be a mere coincidence, it seems more likely that these test subjects share a common ancestor who passed that mutation on to all three of them. Since 23714 is the one with the nearest "brick wall," this shared mutation may be a vital clue in tracing his ancestry. On the other hand, all three have now extended their tests to 37 markers and found two discrepancies at that level (both discrepancies on a pair of markers with a high mutation rate). This result leaves open the question of whether their shared mutation on DYS439 is a true link or just a coincidence.

Besides these matches, we have four more, one making up the majority of Pattern 2 and one each in Patterns 3, 4, and 5. As the project grows, we can expect more matches, since there are now many unique haplotypes, and each one may find kin at any time.

Within Pattern 1, there is quite a variety of haplotypes, but most of them, when considered in pairs on the first 12 markers, are not so very different as to positively preclude them from sharing a common Sweet ancestor. For example, based on the statistics of other DNA studies, and counting only the first 12 loci, the estimated date of the most recent common ancestor for the cluster and 9678 is about five centuries ago, give or take four centuries (and the same for 9678 and 15624 or for 9678 and 16328); for 9678 and 9958, it is about eleven centuries ago, give or take seven; for 16328 and the cluster of matches, it is about nine centuries, give or take five.

Unfortunately, as the previous paragraph shows, testing only 12 loci leaves a rather broad uncertainty in the estimated relationships among members of a populous haplogroup. Even the distinctively different Pattern 2 is only three steps away from Pattern 1 on the first 12. We get a much clearer picture with 25 loci, as discussed below, and it becomes clear that some of the Sweets in Pattern 1 are unrelated to others within a genealogical time frame.

25-Marker Results

Because of the technology and economics of DNA testing, it is customary to measure several loci in one procedure, called a multiplex. A combination of one or more multiplexes, offered as a "package" test, is often called a "panel." Thus, the first 12 loci constitute FTDNA's first panel, as well as the entry-level test. The next 13 constitute the second panel, and these two panels together make up the next-higher-level test. There is no option to test the second panel without the first. (It may be possible to test the second panel "alone" at some combination of other labs, but inter-lab standardization is one of the weak points in the DNA testing industry.) We recommend that participants in this project take the 25-marker test.

The tight bunch of exact matches opens up a little when we consider 25 markers, though a majority of these results still agree exactly at 25/25. For simplicity, we refer to the shared 25-marker haplotype of 9866 and the other exact matches as "the cluster." 12256, 33402, 47939, 60265, and 105761 are only one step away from the cluster, while 18443, 28741, 30688, and 37161 are two steps. 9579, on the other hand, is seven steps from 9866 and eight from 12256. (However, 9579 may be a special case due to a rare type of mutation that affects several markers at once.)

The three samples mentioned above (13407, 23714, and 60265) as being close in terms of 12 markers are also very close to the cluster (only one step in 25 markers) and almost as close to 12256 (two steps). In addition, these three are an exact match to each other at 25 markers. Our conclusion is that the cluster members have inherited the ancestral haplotype of this group, while 12256 and others have inherited one mutation each; 18443 and others have inherited two; and so on. In general, the most common haplotype found in a lineage group is indeed the haplotype of their common ancestor.
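The idea of inferring an ancestral pattern from the majority can be sketched in Python. This sketch takes the most common value marker by marker, which is one reasonable reading of "majority pattern" and an assumption on our part. The sample values and function names are hypothetical, and ties are not given any special treatment.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Return the most common value at each marker across a group of samples."""
    markers = haplotypes[0].keys()
    return {
        m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
        for m in markers
    }

def off_modal(haplotype, modal):
    """List the markers where a sample departs from the modal pattern."""
    return [m for m in modal if haplotype[m] != modal[m]]

# Hypothetical group of three samples, for illustration only.
group = [
    {"DYS439": 12, "DYS449": 29},
    {"DYS439": 12, "DYS449": 30},
    {"DYS439": 11, "DYS449": 29},
]

modal = modal_haplotype(group)
print(modal)                       # {'DYS439': 12, 'DYS449': 29}
print(off_modal(group[1], modal))  # ['DYS449']
```

The markers returned by off_modal correspond to the cells that would be shaded gray in the table.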

The first two participants to join this project, 9579 and 9678, are both within one step of the cluster as viewed with 12 loci, but 25-marker testing has set them apart (seven steps each from the cluster pattern and six steps from each other). These results show them to be very likely unrelated to each other and to the cluster as well (although six of the seven steps of difference for 9579 could be explained as a single recombinant event). Note that neither one has been traced back by conventional genealogy to the common ancestor of the cluster.

These 25-marker results illustrate one of the pitfalls of DNA testing: apparent matches based on 12-marker results can be deceptive. The match on 12 markers, in combination with the fact of a shared surname, would normally indicate that those testees are rather closely related (i.e., on a time scale of hundreds of years or less). However, the 25-marker results show a much wider variety of haplotypes within Pattern 1, with differences indicating a separation of perhaps a thousand years or more. How can such a discrepancy occur? The answer is in the relatively small number of loci tested and the relatively slow rate of mutation. All the time estimates carry a large uncertainty. It is still possible that these testees are a bit less distant than they now seem, but it does appear that we have different groups of Sweets with the same or similar 12-marker haplotypes. We are thus unsure of exactly how they are related. All we can say is that they are probably not descended from the same immigrant Sweet. When we have consistent results from documented descendants, or a clear consensus of testees on a regional basis, the whole picture will be clearer.

The TMRCA estimates change in two ways when we consider 25 loci: first, some of the estimated times become much longer, and, second, the uncertainties in the estimates are smaller (but not tight enough to be considered "precise"). Counting all 25 loci, the estimated time has jumped to about 14 centuries for 9579 and the cluster, give or take five centuries. (If the differences are interpreted as due to recombination, however, the estimate becomes six centuries, give or take four.) Depending on whether the true date falls near the beginning or end of these still rather broad ranges, the similarity of the DNA could be either a coincidence having nothing to do with the shared SWEET surname or the direct result of descent from a common SWEET ancestor.

As is clear from the shading, the samples shown at the end of Pattern 1 are all very different from each other and from the cluster. For these, taken against each other in pairs, or taken one at a time against the cluster, the estimated dates of common ancestor are all more than a millennium ago, with only two exceptions. First, the estimate for 9678 and 15624 is only nine centuries back, give or take four. These two might therefore be related within a genealogically useful time frame, but it is by no means certain. The other exception is 22528, who matches 9958 11/12 on the first panel of markers and matches 9579 13/13 on the second panel. The estimate for 22528 and 9579 based on all markers is also nine centuries. All the other pairs are probably not related within genealogical time.

Because of the striking coincidence that 9579 matches the cluster perfectly on the first panel and matches 22528 perfectly on the second, we considered the possibility of a lab mix-up. However, FTDNA obligingly reran several tests, including the second panel for 9579 and the first panel for 22528, and confirmed the original findings. This leaves us with the striking coincidence intact. Because of the first-panel agreement between 9579 and the cluster, we might argue that his second-panel results are due to a recombinant event that simultaneously equalized his multi-copy DYS459 and DYS464 markers, but the same argument cannot be made for 22528, since his first panel differs sharply from the cluster. This combination of results is one of the cases where 37-marker testing could be very useful in illuminating the relationships.

22528 is interesting for another reason. As shown on the lineage page, he and 18443 claim a common ancestor (Eber Sweet) five generations ago. However, their DNA does not match. Therefore, we must conclude that there is an error somewhere in the line for at least one of them. Additional testing has added to the picture, however. 30688 traces back to a supposed brother of Eber, and, although the initial report of the results for 30688 showed a substantial difference between him and 18443, a subsequent correction reduced the difference to two steps. What's more, they share a mutation at DYS449 that sets them apart from the rest of the Pattern 1 cluster. The genealogical evidence linking the supposed brothers is only circumstantial, but this shared mutation goes a long way toward confirming the close tie. In the meantime, another descendant of Eber Sweet has now taken the DNA test, and his results are illuminating. He is only one step away from the ancestral Pattern 1 haplotype, and that one step of difference is the same shared mutation in 18443 and 30688. This comparison supports the theory that all three are closely related. If another descendant of Elijah could be recruited, the additional comparisons could help to confirm this theory. That still leaves the question of 22528, whose DNA does not match the others. Perhaps more testing of Eber descendants would pin down the point of departure.

In the meantime, another member of the project has turned up with the same mutation at DYS449 shared by the three men discussed above. This result suggests that the new member, N44777, may be closely related to the other three.

Pattern 5 is at present only that -- a DNA pattern. There is no known genealogical connection between the two members of this cluster, and, in fact, there is good reason to believe that the nearest possible connection is four centuries ago. 21864 has traced his lineage back to Henry SWEET of Attleborough, probably a first-generation immigrant to New England, although there is a wide-spread tendency to confuse this Henry with a grandson of John SWEET of Salem. Before we can pin down Henry's haplotype, we must locate another descendant (preferably from Henry's son John) to confirm the DNA reading. In the meantime, however, this result is a 24/25 match with that of 134688, whose ancestor immigrated in 1853.

One member has received a very unusual DNA report - instead of 25 markers, he apparently has 29, including four extra copies of DYS464. Indeed, this situation is so unusual that FTDNA has no procedure for reporting more than three extra copies, and so the testee's results were reported with only 28 markers. However, inquiries sent to the lab about the difficulties of assessing the test for DYS464 elicited the response that they were very sure and, yes, there were eight copies, not just the reported seven.

37-Marker Results and Beyond

Many participants have now extended their tests to 37 or 67 loci at FTDNA or have obtained results from other labs for loci beyond the first 25. The cluster of descendants of John Sweet of Salem, in particular, now has a clear consensus through 37 and the beginnings of a consensus at 67. Indeed, one member of the cluster agrees exactly with the consensus to 37 (and thus with the presumed ancestral pattern). Of course, this agreement does not make him in any sense a closer relative of John Sweet than the others, and in fact he is currently the most distant by count of generations.

Pattern 2 and Pattern 3 both also have the beginnings of a consensus to 37 markers, but the consensus in Table 2 for Pattern 2 is rather spotty because the two bona fide members have tested only 4 markers in common in Table 2. Between them, they cover all but one of the loci in Tables 2 and 3 (50 in all), but we have no way to be sure that all of these results are ancestral. In contrast, Pattern 3 has only two members, and both have tested 37 and agree on 36 of the 37.

The shading of Table 2 and Table 3 is done according to the same rules as in Table 1.

Results for Similar Surnames

We now have a number of test results for subjects named SWETT, mostly descendants of an early Massachusetts immigrant, John Swett of Newbury, who was long thought to be related to John Sweet of Salem. These Swetts are quite different from the cluster of Sweets, apparently separated by over three millennia and in some cases much longer. Among these, we have another cluster as well as some that stand apart (more about them in a moment). Within this Swett cluster, 20321 and two others have the central haplotype, and the others differ by one step each, indicating that they are all probably related through a common ancestor between 2 and 43 generations ago. Indeed, their documented mutual common ancestor is 9-11 generations back from all of them.

We have, in addition, one member who exactly matches the apparent ancestral haplotype, but whose surname is not Swett. As it happens, this person had long been searching for evidence of his biological grandfather. One (and only one) of the candidates was a Swett. Therefore, this DNA test, along with the previous documentary research, almost certainly confirms the Swett as the true grandfather and provides a line back to the immigrant John of Newbury.

sm10 is a special case. This test result was discovered in the Sorenson Molecular Genealogy Foundation (SMGF) on-line database, including a pedigree extending back to John Swett of Newbury. Although the name of the test subject was suppressed for privacy, it is evident that this family has spelled the name as SWEAT since the 1700's. Unfortunately, despite the detailed pedigree, the DNA test results do not match the others. One possible explanation can be seen in the pedigree itself, where John SWEAT's birth in 1789 is listed as being only seven months after his parents' marriage. There is no reason to focus specifically on this timing irregularity, but it could be a break in the lineage, and only one break is needed to explain the DNA discrepancy. It would be helpful to test others in this line, such as descendants of the younger brothers of John Sweat, to determine where the break might be. One such test has already been partly done at SMGF, in the person of sm25, whose pedigree indicates he is a second cousin once removed to sm10, and whose incomplete DNA results show him to be close to the ancestral haplotype, matching the consensus on 16 of the 17 comparable loci. Unfortunately, it appears that the test of sm25 failed to produce usable results for many loci, and we may never learn exactly how close he is to the ancestral pattern. If we assume that the 16/17 match is indicative of the closeness of the missing results, we must conclude that the break in the lineage for sm10 is quite recent. (See the lineage page.)

We have three more special cases. 60470 is from another Massachusetts SWETT family which has been said to be related to the Newbury line. However, the DNA testing shows no similarity at all (with a separation perhaps in the tens of thousands of years). 62552 and 63636 are from a family that has spelled the name SWEAT and SWETT. They all share the distinction of belonging to a different haplogroup (see below) from Pattern 1 and Pattern 2, but 62552 and 63636 are still very different from 60470.

Although the names SWEET and SWETT have long been considered variants of each other, we must conclude that the particular instances now in the project are not related. More precisely, we have found clusters of project members who are related to each other, but no cluster contains both surname variants.

Another variant of the name is SWEAT. As described above, some families have switched back and forth between the SWETT and SWEAT spellings, but it's not yet clear whether the two spellings have different origins. We have one participant with the SWEAT surname who ordered a test kit, but he has not sent the kit back to the lab. Despite that setback, we have one Sweat result anyhow for comparison, even though not a member of our project. FTDNA graciously did a 12-marker comparison for us between their Sweat customer and all of our members and reported that no haplotype in our project (as of 2004 Nov 2) came within four steps of the Sweat's haplotype. In September of 2005, we gained another SWEAT participant via the Genographic Project of the National Geographic Society. Because the pronunciation of SWEAT matches that of SWETT, we might expect him to match the descendants of John SWETT or Pattern 3, if anybody, but the fact is that his closest DNA neighbors (all those within four steps) are SWEETs. We therefore list him with the first group in Table 1. Based on the 2004 report from FTDNA, we know that he cannot be within two steps of the original SWEAT customer. We also have two members named SWEATT who are apparently not related to any of the SWEAT members, nor to each other.

Lineages

To show these results in the context of ordinary genealogy, we have a page with the lineages supplied by the participants. For the most part, the two types of evidence agree -- most of the participants whose DNA matches also share a known common ancestor, and most of those who claim common ancestry also match. The exceptions are marked with footnotes.

FTDNA-only Results

Although the project accepts test results from any lab, most of the testing so far has been done through FTDNA. Therefore, the automatically updated web site provided by FTDNA includes most of the project results. Indeed, it shows them immediately, as soon as they are returned from the lab, and so you may wish to visit that site to see the comparison of new results with those already posted here. The only problems are that the FTDNA site doesn't indicate the discrepant loci by shading and is sorted by allele lengths from left to right within each pattern grouping, sometimes making it awkward to pick out and compare specific kits.

 
 

Global Comparisons

In population genetics, individual haplotypes are classified into broad categories called haplogroups. A large majority of Europeans fall into three of these, designated Haplogroup R1b, Haplogroup I, and Haplogroup R1a (known as hg1, hg2, and hg3, respectively, in an older nomenclature). We anticipate that the same will be true for the Sweets. Indeed, nearly all so far are R1b, with only a few others. One consequence of this clustering is that some individuals are seemingly very similar to others, to the extent that the majority of loci agree. However, to establish relatedness (on a genealogically interesting time scale) requires that the results be virtually identical. In any case, since "R1b" and "I" are both widespread in Europe, the DNA results cannot generally pin down the ethnic origins of the Sweet lines in terms of Anglo-Saxon vs. Norman vs. French vs. "other." R1a, when found in England, is usually considered to be an indication of Scandinavian origin, but it may also come from Germany or eastern Europe. (To date, we have not found any R1a in the project.) The one exception to this uniformity is the member whose haplogroup is E, which is rare in Europe, but common in Africa.
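To illustrate how two haplotypes can agree at most loci without being a genealogical match, the following Python sketch compares two rows of Table 1: the "John" row of Pattern 1 and kit 16328, which is listed under Pattern 4. The helper name `differing_loci` is hypothetical; the locus names and repeat counts are copied from Table 1.

```python
# Locus names in Table 1 order.
LOCI = ["393", "390", "19", "391", "385a", "385b", "426", "388", "439",
        "389i", "392", "389ii", "458", "459a", "459b", "455", "454", "447",
        "437", "448", "449", "464a", "464b", "464c", "464d"]

john_p1 = [13, 24, 14, 10, 11, 14, 12, 12, 12, 14, 13, 30, 18,
           9, 10, 11, 11, 25, 15, 19, 29, 15, 15, 17, 18]
kit_16328 = [13, 24, 14, 11, 11, 14, 12, 12, 12, 14, 14, 30, 16,
             9, 10, 11, 11, 25, 15, 19, 28, 15, 15, 17, 17]


def differing_loci(a, b):
    """Map each locus where the two haplotypes disagree to its pair of values."""
    return {name: (x, y) for name, x, y in zip(LOCI, a, b) if x != y}


diffs = differing_loci(john_p1, kit_16328)
agree = len(LOCI) - len(diffs)
print(f"{agree}/{len(LOCI)} loci agree; differences: {diffs}")
```

Here 20 of the 25 loci agree, yet the five discrepancies are enough to place the two haplotypes in separate patterns.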

One way to get an idea of where the samples fit into the global scene is to search for matches in the databases maintained by forensic DNA researchers. The largest such database, then known as YSTR, contained 13,223 anonymous haplotypes from all over Europe as of November of 2003. This database tabulates nine (or in some cases ten) of the twelve basic markers used by FTDNA. It is thus much less specific than the commercial test, but it still serves to indicate whether a given haplotype is relatively common or rare. (A small portion of the database includes two extra markers, one of which also belongs to the FTDNA basic set.) Thus, for example, the haplotype of the cluster (and others) appears in 37 of the 13,223 samples. These matches are spread across Europe from Spain to Lithuania, with several in England as well. Kit 9678 has even more: 92 matches, similarly spread all across western and central Europe. As it happens, the one-step difference between the cluster and 9678 places the latter one step closer to the modal haplotype of "R1b." Kit 9958 falls even closer to the mode of "R1b," and there we find an even higher frequency: 375 matches, again spotted all over Europe. These high frequencies are part of the explanation of why unrelated Sweets in particular can appear to be near-matches, or even exact matches, on a limited set of loci.
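To put those match counts on a common scale, the short Python sketch below converts each one into a percentage of the 13,223 samples. The variable names are only illustrative; the counts are the ones quoted above.

```python
DATABASE_SIZE = 13_223  # haplotypes in the YSTR database as of November 2003

# Match counts quoted in the text.
matches = {
    "cluster haplotype": 37,
    "kit 9678": 92,
    "kit 9958": 375,
}

percentages = {name: 100 * n / DATABASE_SIZE for name, n in matches.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}% of samples")
```

The three haplotypes account for roughly 0.3%, 0.7%, and 2.8% of the database, respectively, so the most common of them turns up about ten times as often as the least common.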

In 2004, the YSTR database was reorganized to include samples from other continents (previously stored separately). It is now possible to search the global database with a geographic or ethnic focus.

FTDNA maintains three databases of Y DNA results, all searchable by customers only. One database gives the haplogroups found for test subjects whose haplotypes come close to the customer's own haplotype. Another database gives the claimed ethnic origins of near matches. Both of these are anonymous databases containing a mixture of customers and academic test subjects. The third database is just customers and includes names and email addresses, but it reveals these only under certain circumstances. First of all, for 12-marker testees, it looks only for exact 12/12 matches or one-step-off near-matches within the same surname group. For 25-marker testees, it looks first for 12-marker-type matches and then for 24/25 or 25/25 overall and finally for 23/25 (provided that the two discrepancies are no more than one step each). The scope of these searches is limited to those who have sent in a signed release form, but there is an optional limitation on top of that. By default, each customer is set up for "private" searches, i.e., limited only to members of the same surname study project. However, that setting can be changed to "public" to include matches or near-matches among all the customers who have similarly opted for "public" searches. Instructions for changing this setting can be found at the FTDNA web site.
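The match thresholds described above can be sketched in Python as follows. This is only an illustration, not FTDNA's actual procedure. The function name `classify_match` is hypothetical, and the sketch ignores the surname-group, release-form, and private/public conditions. It also skips the preliminary 12-marker check on 25-marker results, and it assumes that the single discrepancy in a 24/25 match may be of any size, which the description above does not specify. The sample haplotypes are rows from Table 1.

```python
def classify_match(a, b):
    """Label two equal-length haplotypes (12 or 25 markers) by the rules above."""
    n = len(a)
    if len(b) != n or n not in (12, 25):
        raise ValueError("expected two haplotypes of 12 or 25 markers each")
    # Size, in steps, of each discrepancy between the two haplotypes.
    steps = [abs(x - y) for x, y in zip(a, b) if x != y]
    if not steps:
        return f"{n}/{n} exact"
    if n == 12:
        return "11/12 one step off" if steps == [1] else "no match"
    if len(steps) == 1:
        return "24/25"  # assumption: the single discrepancy may be any size
    if len(steps) == 2 and max(steps) == 1:
        return "23/25"
    return "no match"


# Rows from Table 1 (25 markers each, except the 12-marker excerpt).
john = [13, 24, 14, 10, 11, 14, 12, 12, 12, 14, 13, 30, 18,
        9, 10, 11, 11, 25, 15, 19, 29, 15, 15, 17, 18]
kit_12256 = [13, 24, 14, 10, 11, 14, 12, 12, 12, 14, 13, 30, 18,
             9, 10, 11, 11, 26, 15, 19, 29, 15, 15, 17, 18]
kit_18443 = [13, 24, 14, 10, 11, 14, 12, 12, 12, 14, 13, 30, 18,
             9, 10, 11, 11, 25, 14, 19, 28, 15, 15, 17, 18]
kit_16328 = [13, 24, 14, 11, 11, 14, 12, 12, 12, 14, 14, 30, 16,
             9, 10, 11, 11, 25, 15, 19, 28, 15, 15, 17, 17]
kit_13407_first12 = [13, 24, 14, 10, 11, 14, 12, 12, 11, 14, 13, 30]

results = {
    "12256": classify_match(john, kit_12256),
    "18443": classify_match(john, kit_18443),
    "16328": classify_match(john, kit_16328),
    "13407 (12 markers)": classify_match(john[:12], kit_13407_first12),
}
for kit, label in results.items():
    print(kit, "->", label)
```

With these rows, kit 12256 comes out as 24/25 and kit 18443 as 23/25 against the Pattern 1 "John" haplotype, while kit 16328 (Pattern 4) fails every threshold.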

FTDNA also operates a public database called YSEARCH, where anyone can upload a haplotype and related genealogical data, and anyone can search for matches by surname or by haplotype. A similar database, called YBASE, was operated by DNA Heritage, but it was shut down when that company went out of business.

Another public database exists for seeking genetic matches. The SMGF is conducting a world-wide research project that involves collecting DNA samples and trying to correlate the patterns with past homelands. The Y DNA data, along with the associated pedigrees, have been made available on-line. Originally, these data could not be searched by surname, but that capability was added in August of 2005. Even so, the user interface is designed primarily to search for matches to a haplotype that the user must supply, and the non-matching markers are reported only as "not a match" instead of actual repeat counts. In any case, the surname of each matching test subject is displayed on the search results page. Note: to save the effort of manually entering the haplotype for searching, we have an index of the haplotypes currently known for members of our project. Within this index, you can click on a kit number (or the generic haplotype of a lineage founder) to search the SMGF database for all matches and near-matches.

 
 

Data

Table 1. Sweet Haplotypes - primary loci
(click on an arrow at the end of a row to see the continuation in Table 2)
DYS Locus:
ID 393 390 19 391 385a 385b 426 388 439 389i 392 389ii 458 459a 459b 455 454 447 437 448 449 464a 464b 464c 464d
Pattern 1 - John SWEET of Salem and others
John 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
9866 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
11087 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
12091 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
12256 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 26 15 19 29 15 15 17 18
12967 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
13407 13 24 14 10 11 14 12 12 11 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
16425 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
18443 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 14 19 28 15 15 17 18 >
18826 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
20471 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
22010 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
23714 13 24 14 10 11 14 12 12 11 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
28741 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 30 15 15 16 18 >
28939 13 24 14 10 11 14 12 12 12 14 13 31 18 9 9 11 11 25 15 19 29 15 15 17 18
30041 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
30073 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
30252 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
30688 13 24 14 10 11 14 12 12 12 14 13 30 17 9 10 11 11 25 15 19 28 15 15 17 18
33402 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 28 15 15 17 18
37161 13 24 14 11 11 14 12 12 13 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
37522 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
47939 13 25 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
60265 13 24 14 10 11 14 12 12 11 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
64064 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
75876 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18
105761 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 12 11 25 15 19 29 15 15 17 18
108702 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
131837 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
148196 13 24 14 10 11 14 12 12 12 15 13 31 18 9 10 11 11 25 15 19 29 15 15 17 18 >
176041 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
182125 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
194073 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
197117 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
241716 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 15 17 18 >
257880 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 28 15 15 17 18 >
N29526 13 24 14 10 11 14 12 12 12 14 13 30
N44777 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 28 15 15 17 18 >
N73556 13 24 14 10 11 14 12 12 12 14 13 30
btrbz 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 29 15 17 18 >
9579 13 24 14 10 11 14 12 12 12 14 13 30 17 9 9 11 11 25 15 19 29 15 15 15 15
9678 13 24 14 11 11 14 12 12 12 14 13 30 17 9 10 11 11 25 15 18 30 15 15 16 16 >
15624 13 24 15 11 11 14 12 12 12 14 13 30 17 9 10 11 11 25 15 19 30 15 15 17 17
28483 13 24 14 11 11 15 12 12 12 13 13 29 18 9 9 11 11 24 14 19 29 14 15 17 17
N35858 14 24 14 11 11 15 12 12 12 13 13 29
9958 13 24 14 11 11 14 12 12 13 13 13 29 18 9 9 11 11 25 15 19 30 17 17 17 17
22528 13 24 14 11 11 14 12 12 13 13 13 30 17 9 9 11 11 25 15 19 29 15 15 15 15
N8230 13 24 14 11 11 14 12 12 12 13 15 29
51809 13 24 14 11 11 15 12 12 11 13 13 29 17 9 9 11 11 26 15 19 27 15 15 16 17
58770 13 25 14 10 11 14 12 12 11 13 14 29 17 9 10 11 11 25 15 18 30 15 16 16 17 >
107413 13 24 15 11 11 15 12 12 12 13 13 29 17 9 10 11 11 25 15 19 29 15 15 16 17 >
128002 13 23 14 10 11 14 12 12 12 13 13 29 18 9 9 11 11 25 15 18 29 15 15 18 19 >
177518 13 23 14 11 11 14 12 12 11 12 13 28 18 9 10 11 11 25 15 19 30 15 16 17 18
235581 13 23 14 11 11 14 12 12 12 13 14 29 17 9 10 11 11 25 14 19 30 15 15 17 18 >
278794 13 24 14 10 11 14 12 12 12 14 13 30 18 9 10 11 11 25 15 19 31 15 15 16 18 >
N76454 13 23 14 11 11 15 12 12 12 13 13 29
mbpr6 13 23 14 11 11 14 12 12 12 13 13 29 17 9 10 11 11 24 14 19 29 15 16 16 18 >
Pattern 2 - John SWETT of Newbury
John 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
14648 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 32 14 14 15 17
17174 13 23 14 10 11 14 11 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
20321 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
21194 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 25 15 19 31 14 14 15 17
26162 13 23 14 10 11 15 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
61736 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
73554 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 14 17
77015 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
81177 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 30 14 14 15 17
154750 13 23 14 10 11 13 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 16 >
270295 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17
N15262 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17 >
N85364 13 23 14 10 11 14 12 12 11 13 13 29 18 8 10 11 11 24 15 19 31 14 14 15 17 >
sm10 13 23 15 11 10 15 12 12 12 13 13 29 18 9 10 11 11 24 15 19 29 15 16 17 18 >
sm25 13 14 18 8 10 11 11 15 31 14 15 15 17 >
Pattern 3 - Robert W. SWETT b. 1858
62552 13 22 14 9 13 15 11 14 11 12 11 28 15 8 9 8 11 23 16 20 29 12 15 15 15 >
63636 13 22 14 9 13 15 11 14 11 12 11 28 15 8 9 8 11 23 16 20 29 12 15 15 15 >
190589 13 22 14 9 13 14 11 14 11 12 11 28 15 8 9 8 11 23 16 20 29 12 15 15 15 >
Pattern 4 - Sweets of Devon
16328 13 24 14 11 11 14 12 12 12 14 14 30 16 9 10 11 11 25 15 19 28 15 15 17 17 >
83307 13 24 14 11 11 14 12 12 12 14 14 30 16 9 10 11 11 25 15 19 28 15 15 17 17
Pattern 5 - unknown connection
21864 13 24 14 10 11 11 13 12 11 13 13 28 17 9 9 11 11 25 15 18 34 15 16 17 17
134688 13 24 14 10 11 11 13 12 11 13 13 28 18 9 9 11 11 25 15 18 34 15 16 17 17 >
Other
60470 12 24 14 10 15 16 11 14 12 12 11 28 16 8 9 8 11 23 16 20 28 14 15 16 16
80147 15 21 16 10 16 17 11 12 11 12 11 29 16 8 10 11 11 27 14 21 30 13 13 15 15 16 16 17 18 >
p5cnx 15 21 16 10 16 17 11 12 11 12 11 29 16 8 10 11 11 27 14 21 30 13 15 16 18 >
86497 12 22 14 10 12 14 11 14 11 12 11 28 14 8 9 8 11 24 16 20 30 12 14 16 16
142293 13 22 14 10 14 14 11 14 11 12 11 28 15 8 9 8 11 23 16 20 26 12 13 14 15 >
294285 14 25 13 9 17 18 11 12 11 13 11 31 16 9 9 11 11 21 14 20 28 14 15 15 17 >
N110904 13 25 17 11 11 14 12 12 10 13 11 30 16 9 10 11 11 24 14 20 31 12 15 15 16 >
gvz3a 12 23 14 12 11 15 12 12 11 13 13 30 18 9 10 11 11 24 15 19 29 15 15 16 16 >
B1333 13 24 14 10 11 15 12 12 11 13 13 29 16 9 10 11 11 24 15 19 29 15 15 17 17 >

 
Table 2. Sweet Haplotypes - Other loci (FTDNA and Sorenson)
(click on an arrow at the end of a row to see the continuation in Table 3)
Note: DYS461 has been converted to new nomenclature as of 2004 June.
Note: DYS452 and DYS463 have been converted to new nomenclature as of 2010 March.

Locus:
ID 460 H4 YCAIIa YCAIIb 456 607 576 570 CDYa CDYb 442 438 #441 *444 #445 *446 #452 #461 #462 #463 #A10 #+635 #1B07
John 11 10 19 23 15 15 17 18 37 38 12 12
9866 11 10 19 23 15 15 17 18 37 38 12 12
11087 11 10 19 23 15 15 17 18 37 39 12 12
13407 11 10 19 23 15 15 17 18 36 38 12 12
18443 11 10 19 23 15 15 17 18 37 38 12 12
23714 11 10 19 23 15 15 17 18 37 37 12 12
28741 11 10 19 23 15 15 17 18 36 39 12 12
37522 11 10 19 23 15 15 17 18 37 38 12 12
47939 11 10 19 23 15 15 17 18 37 38 12 12
64064 11 10 19 23 15 15 17 20 37 40 12 12
60265 11 10 19 23 15 15 17 18 37 38 12 12
108702 11 10 19 23 15 15 17 18 38 39 12 12 12 13 >
131837 11 10 19 23 15 15 17 18 37 38 12 12
176041 11 10 19 23 15 15 17 18 38 39 12 12
148196 11 10 23 23 15 15 17 18 38 38 12 12
182125 11 10 19 23 15 15 17 18 37 39 12 12
194073 11 10 19 23 15 15 17 18 37 39 12 12
197117 11 10 19 23 15 15 17 18 37 39 12 12
241716 11 10 19 23 15 15 17 18 38 39 12 12
257880 11 10 19 23 15 15 18 18 37 39 12 12
N44777 11 10 19 23 15 15 17 18 37 38 12 12 13 12 12 14 30 11 11 25 12 23 10 >
btrbz 11 10 19 23 15 15 18 12 12 12 30 25 >
9678 11 12 19 23 16 15 18 18 39 40 11 12
58770 11 11 19 23 15 16 17 17 39 39 12 12
107413 11 11 19 23 16 16 17 17 35 39 11 12 12 13 >
128002 11 10 19 23 17 15 18 17 38 38 12 12
235581 11 11 19 23 17 15 17 17 37 38 12 12
278794 12 11 19 23 16 15 18 18 37 38 11 12 13 13 >
mbpr6 11 10 19 23 17 13 12 13 13 12 11 13 23 >
154750 11 11 19 23 15 14 16 18 37 37 12 12 14 13 >
N15262 11 11 19 23 15 14 16 18 37 37 12 12 14 13 >
N85364 11 11 19 23 15 14 16 17 37 37 12 12 13 14 12 13 30 12 11 25 12 24 10 >
sm10 11 10 19 23 17 13 12 13 12 13 13 30 12 11 24 13 23 10
sm25 15 12 13 14 12 13 30 12 11 12 24 10
62552 10 11 19 21 14 15 16 21 31 38 12 10
63636 10 11 19 21 14 15 16 20 31 38 12 10
190589 10 11 19 21 13 15 16 20 31 37 12 10 13 13 >
16328 11 10 19 23 17 14 18 18 33 36 12 12 12 13 >
134688 10 11 19 23 15 14 19 17 36 37 12 12
80147 11 10 19 21 16 13 18 18 31 33 12 11
p5cnx 11 10 19 21 16 12 11 14 12 11 14 30 13 12 20 11 21 11
142293 9 9 19 21 15 15 15 19 35 35 12 10
294285 10 10 19 19 15 13 12 16 33 34 13 10
N110904 11 11 19 23 15 15 17 18 36 38 12 11 14 12 >
gvz3a 10 11 19 23 15 12 12 13 12 12 13 30 13 11 24 12 23 10
B1333 11 11 19 23 15 15 19 17 36 36 13 12 13 12 12 13 30 12 11 24 14 24 10
* Also displayed in Table 3
# Also displayed in Table 4
+ Formerly called Y-GATA-C4

 
Table 3. Sweet Haplotypes - More loci (FTDNA or other)
(click on an arrow at the end of a row to see the continuation in Table 4)

Locus:
ID 531 578 395a 395b 590 537 641 472 406s1 511 425 413a 413b 557 594 436 490 534 450 *444 481 520 *446 617 568 487 572 640 492 565
108702 11 9 15 16 8 10 10 8 10 11 12 23 23 16 10 12 12 14 8 12 23 20 13 12 11 12 11 11 13 12
N44777 11 9 15 16 8 10 10 8 10 11 12 23 23 16 10 12 12 14 8 12 23 20 14 12 11 12 11 11 13 12 >
btrbz 11 16 12 23 >
107413 11 9 15 16 8 10 10 8 10 10 12 21 23 16 10 12 12 15 8 12 22 20 13 12 11 13 11 11 13 12
278794 11 9 15 16 8 10 10 8 10 9 12 23 23 15 11 12 12 16 8 13 23 20 13 12 11 13 11 11 12 12
mbpr6 11
154750 11 9 15 16 8 10 10 8 10 10 12 23 23 16 10 13 12 15 8 14 22 20 13 12 11 13 11 11 13 12
N15262 11 9 15 16 8 10 10 8 10 10 12 23 23 16 10 13 12 15 8 14 22 20 13 12 11 13 11 11 13 12
N85364 11 9 15 16 8 10 10 8 10 10 12 23 23 16 10 13 12 15 8 14 22 20 13 12 11 13 11 11 13 12 >
190589 11 11 15 15 8 12 10 8 9 9 12 22 25 15 10 12 12 16 8 13 25 20 13 13 11 12 11 11 12 11
16328 11 9 15 16 8 10 10 8 11 10 12 23 23 16 10 12 12 16 8 12 24 19 13 12 11 13 11 11 12 12
N110904 12 8 17 17 8 12 10 8 11 11 12 22 22 17 11 12 12 13 8 14 23 21 12 12 11 13 11 11 12 13
* Also displayed in Table 2

 
Table 4. Sweet Haplotypes - More loci (FTDNA or other)

Locus:
ID 710 485 632 495 540 714 716 717 505 556 549 589 522 494 533 636 575 638 #462 #452 #445 #A10 #463 #441 #1B07 525 712 593 650 532 715 504 513 561 552 726 #635 587 643 497 510 434 #461 435
N44777 35 15 9 16 12 27 26 19 13 11 13 12 10 9 12 12 10 11 11 30 12 12 25 13 10 10 20 15 19 13 24 16 12 15 24 12 23 18 11 14 17 9 11 11
btrbz 10 30 12 25
N85364 34 15 9 16 12 27 26 19 12 11 13 12 10 9 12 12 10 11 11 30 12 12 25 13 10 10 20 15 19 13 23 16 12 16 25 12 24 18 10 14 18 9 12 11
# Also displayed in Table 2

Last update: 2013 Dec 6
