Lumper or Splitter for one name study


johnfirr@...
 

Just a question out of interest really. I started using FH for family history research and adopted what I believe to be a "splitter" technique for recording sources, i.e. I raise an individual source for every record for every individual. This has the advantage that it allows me to record lots of detail and it's very specific when searching; however, it is relatively slow, as there is lots of data entry for just one event (say a Birth index entry).
Last year I moved into carrying out a one name study and have continued to "split", which has not been a huge problem since the main name is very rare, so if I wanted to enter, say, all of the births on Findmypast for that name in the UK since 1837, I would only have a couple of hundred.

However, I have now started on one of the variants which, whilst not huge, is an order of magnitude higher, so for instance I have just pulled all of the UK births for that variant from FreeBMD and have a list of over a thousand. Entering these using splitting means also originating 1500 or so sources as well. This is a bit daunting and I am toying with becoming a "lumper" and creating one source for the CSV list arising from that search; only the perfectionist in me is stopping me at the moment.

Just interested in what other one namers do when you have a large list - is lumping the answer?

regards
John Firr


Paul Sillitoe
 

Hi John

Whatever the dataset, I always find myself having to split it into granular detail in the end, to enable any sensible analysis.  In fact, I'm sitting here now trying to persuade myself that all the extra effort won't be necessary for a new task I'm about to start, but knowing in my heart that it will 😄

All best

Paul





David Wilkinson
 

John,

I lump for things like birth, death and marriage indexes and certificates, and for each census (e.g. one lumped source for 1841, one for 1851, etc.), and record the detail in the "where within source", "text from source" and "note" fields, so that each citation becomes unique under a lumped heading. The key, to me, is to adopt a strategy and then stick to it.
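
As a rough illustration of that layout (a sketch only, in Python, and not anything FH itself provides): one lumped source record holds many citations, and each citation carries the detail that varies per entry. The class names, the CSV column names and the sample rows below are all made up for the example; a real index export will have its own layout.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class Citation:
    """The per-entry detail that varies from one index entry to the next."""
    where_within_source: str
    text_from_source: str
    note: str = ""


@dataclass
class Source:
    """One source record; a lumped source carries many citations."""
    title: str
    citations: List[Citation] = field(default_factory=list)


def lump_index_rows(title: str, csv_text: str) -> Source:
    """Build ONE source for a whole index search, with one citation per row."""
    source = Source(title)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are hypothetical, not a real export layout.
        where = (f"{row['quarter']} {row['year']}, {row['district']}, "
                 f"vol {row['volume']}, p {row['page']}")
        text = f"{row['surname']}, {row['forenames']}"
        source.citations.append(Citation(where, text))
    return source


# Invented placeholder rows, purely to exercise the sketch.
SAMPLE = """surname,forenames,year,quarter,district,volume,page
Example,Ann,1850,Mar,Anytown,1a,100
Example,John,1852,Jun,Anytown,1a,200
"""

lumped = lump_index_rows("Birth index (lumped)", SAMPLE)
print(len(lumped.citations), "citations under one source")
print(lumped.citations[0].where_within_source)
```

Splitting the same list would instead mean one source record per row, which is where the source table starts to balloon.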

If I split everything I would have 100,000s of entries in the Source table, which seems daft to me.

The purpose of a citation is so that the reader can understand where the data came from and find it easily if they wish.

David Wilkinson



Adrian Bruce
 

I don't do an ONS but I am an ardent splitter. However, my unscientific gut feeling is that
 - (a) virtually every splitter will actually lump **some** source records;
 - (b) the classic source records that we (including me) lump are BMD **indexes** such as FreeBMD or the GRO online indexes.

I have a single source record for all of the FreeBMD indexes, another for the GRO Online indexes, another for the Ancestry BMD indexes for England & Wales, one for CheshireBMD, one for LancashireBMD, etc, etc.

My reason was quite simple - I felt that the payload of repeated data in the source record made it easier for me to reuse the same source record, and the actual varying detail could easily go into the "citation" details. This gives only a couple of repeats of the citation for a single BMD index entry, unlike the number of repeats for a birth certificate, say, given that a certificate tends to support many more facts. (Normally I dislike repeats.)

Adrian









Lorna Craig
 

Like Adrian, I am a splitter for most things but a ‘lumper’ for indexes, where the information is minimal and there are no images involved.  So for the GRO England and Wales indexes I have one source for the births index, one for marriages and one for deaths. (Unlike Adrian I don’t distinguish which website I searched the index on).  The dates and index references go in the citations.

 

Only when I obtain a more detailed source, usually a certificate, do I create a separate ‘split’ source. An image of the certificate is attached to the split source.

 

Lorna

 



johnfirr@...
 

Thanks everyone,
that confirms where I think I am. I wouldn't want to do anything other than split where I have a specific source such as a certificate and image, but splitting an index does feel like a step too far, so it is interesting that others have taken this approach.
John F.


David Hodgson-Brown
 

Hi All

 

Just another perspective. I lump items if I am going to use them for one fact/event, such as a Birth/Death/Marriage. If I am going to use a source to support multiple facts/events, such as a residence record attached to multiple people for a census, then I create a source using the Census Source. This is so I can alter only one record and it will affect all the people.

 

Regards

 

David

 
